The '''dead-end elimination''' algorithm '''(DEE)''' is a method for [[optimization (mathematics)|minimizing]] a function over a discrete set of independent variables.  The basic idea is to identify "dead ends", i.e., "bad" combinations of variables that cannot possibly yield the global minimum and to refrain from searching such combinations further. Hence, dead-end elimination is a mirror image of [[dynamic programming]], in which "good" combinations are identified and explored further. Although the method itself is general, it has been developed and applied mainly to the problems of [[protein structure prediction|predicting]] and [[protein design|designing]] the structures of [[protein]]s. The original description and proof of the dead-end elimination theorem can be found in {{ref|Desmet}}.
 
==Basic requirements==
An effective DEE implementation requires four pieces of information:
# A well-defined finite set of discrete independent variables
# A precomputed numerical value (considered the "energy") associated with each element in the set of variables (and possibly with their pairs, triples, etc.)
# A criterion or criteria for determining when an element is a "dead end", that is, when it cannot possibly be a member of the solution set
# An [[objective function]] (considered the "energy function") to be minimized
 
Note that the criteria can easily be reversed to identify the maximum of a given function as well.
 
==Applications to protein structure prediction==
Dead-end elimination has been used effectively to predict the structure of side chains on a given [[tertiary structure|protein backbone structure]] by minimizing an energy function <math>E</math>.  The [[dihedral angle]] search space of the side chains is restricted to a discrete set of [[rotamer]]s for each [[amino acid]] position in the protein (whose length is fixed). The original DEE description included criteria for the elimination of single rotamers and of rotamer pairs, although the approach can be extended to larger groups of rotamers.
 
 
In the following discussion, let <math>N</math> be the length of the protein and let <math>r_{k}</math> represent the rotamer of the <math>k^{\mathrm{th}}</math> side chain.  Since atoms in proteins are assumed to interact only by two-body [[potential]]s, the energy may be written
 
:<math>
E_{TOT} = \sum_{k} E_{k}(r_{k}) + \sum_{k < l} E_{kl}(r_{k}, r_{l})\,
</math>
 
 
where <math>E_{k}(r_{k})</math> represents the "self-energy" of a particular rotamer <math>r_{k}</math>, and <math>E_{kl}(r_{k}, r_{l})</math> represents the "pair energy" of the rotamers <math>r_{k}</math> and <math>r_{l}</math>.
 
 
Also note that <math>E_{kk}(r_{k}^{A}, r_{k}^{A})</math> (that is, the pair energy between a rotamer and itself) is taken to be zero, and thus does not affect the summations. This notation simplifies the description of the pairs criterion below.
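
The total energy of one complete rotamer assignment can be evaluated as in the following minimal Python sketch. The data layout is an assumption made only for illustration: <code>self_e[k][a]</code> holds the self-energy of rotamer <code>a</code> at position <code>k</code>, and <code>pair_e[(k, l)][a][b]</code> (stored once per unordered pair, with <code>k < l</code>) holds the pair energy. The helper names <code>pair_energy</code> and <code>total_energy</code> are hypothetical, and each unordered pair of positions is assumed to contribute once to the sum.

```python
from itertools import combinations


def pair_energy(pair_e, k, a, l, b):
    """Pair energy between rotamer a at position k and rotamer b at position l."""
    if k == l:
        return 0.0  # the pair energy of a rotamer with itself is taken to be zero
    if k < l:
        return pair_e[(k, l)][a][b]
    return pair_e[(l, k)][b][a]  # pair energies are symmetric


def total_energy(self_e, pair_e, assignment):
    """Sum of self-energies plus pair energies, each unordered pair counted once."""
    n = len(assignment)
    energy = sum(self_e[k][assignment[k]] for k in range(n))
    energy += sum(
        pair_energy(pair_e, k, assignment[k], l, assignment[l])
        for k, l in combinations(range(n), 2)
    )
    return energy


# Toy data (made up): two positions with two rotamers each.
self_e = [[0.0, 1.0], [0.5, 0.0]]
pair_e = {(0, 1): [[2.0, -1.0], [0.0, 3.0]]}
```

Here <code>total_energy(self_e, pair_e, [0, 1])</code> adds the two self-energies of the chosen rotamers to their single pair energy.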
 
===Singles elimination criterion===
 
If a particular rotamer <math>r_{k}^{A}</math> of side chain <math>k</math> cannot possibly give a better energy than another rotamer <math>r_{k}^{B}</math> of the same side chain, then rotamer A can be eliminated from further consideration, which reduces the search space.  Mathematically, this condition is expressed by the inequality
 
:<math>
E_{k}(r_{k}^{A}) + \sum_{l=1}^{N} \min_{X} E_{kl}(r_{k}^{A}, r_{l}^{X}) > E_{k}(r_{k}^{B}) + \sum_{l=1}^{N} \max_{X} E_{kl}(r_{k}^{B}, r_{l}^{X})
</math>
 
 
where <math>\min_{X} E_{kl}(r_{k}^{A}, r_{l}^{X})</math> is the minimum (best) energy possible between rotamer <math>r_{k}^{A}</math> of side chain <math>k</math> and ''any'' rotamer X of side chain <math>l</math>.  Similarly, <math>\max_{X} E_{kl}(r_{k}^{B}, r_{l}^{X})</math> is the maximum (worst) energy possible between rotamer <math>r_{k}^{B}</math> of side chain <math>k</math> and ''any'' rotamer X of side chain <math>l</math>.
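
The singles criterion can be sketched in Python as follows. This is only an illustration under assumed conventions: <code>self_e[k][a]</code> and <code>pair_e[(k, l)][a][b]</code> (with <code>k < l</code>) are hypothetical containers for the precomputed energies, and the term <code>l = k</code> is skipped because the pair energy of a rotamer with itself is taken to be zero.

```python
def pair_energy(pair_e, k, a, l, b):
    """Symmetric lookup of the pair energy between (k, a) and (l, b)."""
    if k == l:
        return 0.0
    return pair_e[(k, l)][a][b] if k < l else pair_e[(l, k)][b][a]


def singles_dead_end(self_e, pair_e, k, a, b):
    """True if rotamer a at position k is a dead end relative to rotamer b."""
    best_a = self_e[k][a]   # best-case energy contribution of rotamer a
    worst_b = self_e[k][b]  # worst-case energy contribution of rotamer b
    for l in range(len(self_e)):
        if l == k:
            continue
        candidates = range(len(self_e[l]))
        best_a += min(pair_energy(pair_e, k, a, l, x) for x in candidates)
        worst_b += max(pair_energy(pair_e, k, b, l, x) for x in candidates)
    return best_a > worst_b


# Toy data (made up): rotamer 0 at position 0 has a large self-energy.
self_e = [[5.0, 0.0], [0.0, 0.0]]
pair_e = {(0, 1): [[1.0, 2.0], [0.5, 1.5]]}
```

With these toy energies, the best case for rotamer 0 at position 0 is still worse than the worst case for rotamer 1, so <code>singles_dead_end(self_e, pair_e, 0, 0, 1)</code> reports a dead end, while the reverse comparison does not.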
 
===Pairs elimination criterion===
 
The pairs criterion is more difficult to describe and to implement, but it adds significant eliminating power. For brevity, we define the shorthand variable <math>U_{kl}^{AB}</math>, the ''intrinsic'' energy of a pair of rotamers <math>A</math> and <math>B</math> at positions <math>k</math> and <math>l</math>, respectively:
 
:<math>
U_{kl}^{AB} \ \stackrel{\mathrm{def}}{=}\  E_{k}(r_{k}^{A}) + E_{l}(r_{l}^{B}) + E_{kl}(r_{k}^{A}, r_{l}^{B})
</math>
 
A given pair of rotamers <math>A</math> and <math>B</math> at positions <math>k</math> and <math>l</math>, respectively, cannot ''both'' be in the final solution (although one or the other may be) if there is another pair <math>C</math> and <math>D</math> that always gives a better energy. Expressed mathematically,
 
:<math>
U_{kl}^{AB} + \sum_{i=1}^{N} \min_{X} \left(E_{ki}(r_{k}^{A}, r_{i}^{X}) + E_{li}(r_{l}^{B}, r_{i}^{X})\right) > U_{kl}^{CD} + \sum_{i=1}^{N} \max_{X} \left(E_{ki}(r_{k}^{C}, r_{i}^{X}) + E_{li}(r_{l}^{D}, r_{i}^{X})\right)
</math>
 
where <math>A \neq C</math>, <math>B \neq D</math> and <math>k \neq l</math>.
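
A possible Python sketch of the pairs criterion is given below. It assumes the same hypothetical layout as a plain nested-list and dictionary store (<code>self_e[k][a]</code>, <code>pair_e[(k, l)][a][b]</code> with <code>k < l</code>), sums over every position <math>i</math> other than <math>k</math> and <math>l</math>, and uses the same rotamer <math>X</math> at position <math>i</math> for both interaction terms inside each minimum or maximum.

```python
def pair_energy(pair_e, k, a, l, b):
    """Symmetric lookup of the pair energy between (k, a) and (l, b)."""
    if k == l:
        return 0.0
    return pair_e[(k, l)][a][b] if k < l else pair_e[(l, k)][b][a]


def pair_dead_end(self_e, pair_e, k, a, l, b, c, d):
    """True if rotamers (a at k, b at l) are dead-ending relative to (c at k, d at l)."""

    def intrinsic(x, y):
        # U_kl^{xy}: both self-energies plus the mutual pair energy
        return self_e[k][x] + self_e[l][y] + pair_energy(pair_e, k, x, l, y)

    best_ab = intrinsic(a, b)
    worst_cd = intrinsic(c, d)
    for i in range(len(self_e)):
        if i in (k, l):
            continue
        candidates = range(len(self_e[i]))
        best_ab += min(
            pair_energy(pair_e, k, a, i, x) + pair_energy(pair_e, l, b, i, x)
            for x in candidates
        )
        worst_cd += max(
            pair_energy(pair_e, k, c, i, x) + pair_energy(pair_e, l, d, i, x)
            for x in candidates
        )
    return best_ab > worst_cd


# Toy data (made up): three positions, two rotamers each; rotamers 0 and 0
# at positions 0 and 1 clash strongly with each other.
self_e = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
pair_e = {
    (0, 1): [[10.0, 0.0], [0.0, 0.0]],
    (0, 2): [[0.0, 1.0], [0.0, 1.0]],
    (1, 2): [[0.0, 1.0], [0.0, 1.0]],
}
```

In this toy case the clashing pair is flagged as a dead-ending pair, although neither rotamer is necessarily eliminated on its own.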
 
===Energy matrices===
For large <math>N</math>, the matrices of precomputed energies can become costly to store. Let <math>N</math> be the number of amino acid positions, as above, and let <math>p</math> be the number of rotamers at each position (this is usually, but not necessarily, constant over all positions). Each self-energy matrix for a given position requires <math>p</math> entries, so the total number of self-energies to store is <math>Np</math>. ''Each'' pair energy matrix between two positions <math>k</math> and <math>l</math>, with <math>p</math> discrete rotamers at each position, is a <math>p \times p</math> matrix. An unreduced set of pair matrices covering all <math>N^{2}</math> ordered pairs of positions therefore has <math>N^{2}p^{2}</math> entries in total. This can be trimmed somewhat, at the cost of additional complexity in implementation, because pair energies are symmetrical and the pair energy between a rotamer and itself is zero.
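
The storage counts above can be checked with a short Python sketch. The function name <code>storage_counts</code> is hypothetical, and the reduced count assumes that only one <math>p \times p</math> matrix is kept per unordered pair of distinct positions.

```python
def storage_counts(n_positions, rotamers_per_position):
    """Return (self-energy entries, unreduced pair entries, reduced pair entries)."""
    n, p = n_positions, rotamers_per_position
    self_entries = n * p
    unreduced_pair_entries = n * n * p * p
    # Symmetry and the zero self-pair term leave n*(n-1)/2 distinct position pairs.
    reduced_pair_entries = (n * (n - 1) // 2) * p * p
    return self_entries, unreduced_pair_entries, reduced_pair_entries
```

For example, 100 positions with 10 rotamers each give 1,000 self-energies, 1,000,000 unreduced pair entries and 495,000 reduced pair entries.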
 
==Implementation and efficiency==
The above two criteria are normally applied iteratively until convergence, defined as the point at which no more rotamers or pairs can be eliminated. Since this is normally a reduction in the sample space by many orders of magnitude, simple enumeration will suffice to determine the minimum within this pared-down set.
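
The iterate-then-enumerate procedure can be sketched as follows. For brevity this hypothetical Python example applies only the singles criterion, restricted to the rotamers that are still alive, and then enumerates the surviving combinations. The names <code>dee_minimize</code>, <code>self_e</code> and <code>pair_e</code> are assumptions for illustration, not part of any published implementation.

```python
from itertools import combinations, product


def pair_energy(pair_e, k, a, l, b):
    """Symmetric lookup of the pair energy between (k, a) and (l, b)."""
    if k == l:
        return 0.0
    return pair_e[(k, l)][a][b] if k < l else pair_e[(l, k)][b][a]


def total_energy(self_e, pair_e, assignment):
    """Self-energies plus pair energies, each unordered pair counted once."""
    n = len(assignment)
    energy = sum(self_e[k][assignment[k]] for k in range(n))
    for k, l in combinations(range(n), 2):
        energy += pair_energy(pair_e, k, assignment[k], l, assignment[l])
    return energy


def is_dead_end(self_e, pair_e, alive, k, a, b):
    """Singles criterion for a versus b at position k, over surviving rotamers."""
    best_a, worst_b = self_e[k][a], self_e[k][b]
    for l in range(len(alive)):
        if l == k:
            continue
        best_a += min(pair_energy(pair_e, k, a, l, x) for x in alive[l])
        worst_b += max(pair_energy(pair_e, k, b, l, x) for x in alive[l])
    return best_a > worst_b


def dee_minimize(self_e, pair_e):
    """Eliminate dead ends until convergence, then enumerate what is left."""
    alive = [set(range(len(row))) for row in self_e]
    changed = True
    while changed:  # stop when a full sweep eliminates nothing
        changed = False
        for k in range(len(alive)):
            for a in sorted(alive[k]):
                if any(b != a and is_dead_end(self_e, pair_e, alive, k, a, b)
                       for b in sorted(alive[k])):
                    alive[k].discard(a)
                    changed = True
    survivors = product(*(sorted(s) for s in alive))
    best = min(survivors, key=lambda asg: total_energy(self_e, pair_e, asg))
    return list(best), alive


# Toy data (made up): two positions with two rotamers each.
self_e = [[5.0, 0.0], [0.0, 0.0]]
pair_e = {(0, 1): [[1.0, 2.0], [0.5, 1.5]]}
best, alive = dee_minimize(self_e, pair_e)
```

Because the inequality is strict, a rotamer is removed only while a competing rotamer survives at the same position, so no position is ever left empty in this sketch.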
 
Because the criteria eliminate only rotamers and pairs that cannot be part of the global minimum, the DEE algorithm is guaranteed to find the optimal solution; that is, it is a [[global optimization]] process. The single-rotamer search scales [[quadratic growth|quadratically]] in time with the ''total'' number of rotamers. The pair search scales cubically and is the slowest part of the algorithm (aside from energy calculations). This is a dramatic improvement over brute-force enumeration, which scales as <math>O(p^{N})</math>.
 
A large-scale [[benchmark (computing)|benchmark]] of DEE compared with alternative methods of [[protein structure prediction]] and design finds that DEE reliably converges to the optimal solution for protein lengths for which it runs in a reasonable amount of time{{ref|Voigt}}. It significantly outperforms the alternatives under consideration, which involved techniques derived from [[mean field theory]], [[genetic algorithm]]s, and the [[Monte Carlo method]]. However, the other algorithms are appreciably faster than DEE and thus can be applied to larger and more complex problems; their relative accuracy can be extrapolated from a comparison to the DEE solution within the scope of problems accessible to DEE.
 
==Protein design==
{{main|Protein design}}
The preceding discussion implicitly assumed that the rotamers <math>r_{k}</math> are all different orientations of the same amino acid side chain. That is, the sequence of the protein was assumed to be fixed. It is also possible to allow multiple side chains to "compete" over a position <math>k</math> by including the rotamers of every candidate side chain type in the set of rotamers for that position. This allows a novel sequence to be designed onto a given protein backbone. A short [[zinc finger]] protein fold has been redesigned this way{{ref|Dahiyat}}. However, this greatly increases the number of rotamers per position and still requires a fixed protein length.
 
==Generalizations==
More powerful and more general criteria have been introduced that improve both the efficiency and the eliminating power of the method for both prediction and design applications. One example is a refinement of the singles elimination criterion known as the Goldstein criterion{{ref|Goldstein}}, which arises from fairly straightforward algebraic manipulation before applying the minimization:
 
:<math>
E_{k}(r_{k}^{A}) - E_{k}(r_{k}^{B}) + \sum_{l=1}^{N} \min_{X} \left(E_{kl}(r_{k}^{A}, r_{l}^{X}) - E_{kl}(r_{k}^{B}, r_{l}^{X})\right) > 0
</math>
 
Thus rotamer <math>r_{k}^{A}</math> can be eliminated if some alternative rotamer <math>r_{k}^{B}</math> at position <math>k</math> contributes less to the total energy than <math>r_{k}^{A}</math> for every choice of rotamers at the other positions. This is an improvement over the original criterion, which requires comparison of the best possible (that is, the smallest) energy contribution from <math>r_{k}^{A}</math> with the ''worst'' possible contribution from an alternative rotamer.
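
The difference between the two singles criteria can be illustrated with a small Python sketch. The layout <code>self_e[k][a]</code> and <code>pair_e[(k, l)][a][b]</code> (with <code>k < l</code>) and the function names are assumptions for illustration, and the toy energies are made up so that the Goldstein criterion eliminates a rotamer that the original criterion leaves in place.

```python
def pair_energy(pair_e, k, a, l, b):
    """Symmetric lookup of the pair energy between (k, a) and (l, b)."""
    if k == l:
        return 0.0
    return pair_e[(k, l)][a][b] if k < l else pair_e[(l, k)][b][a]


def original_dead_end(self_e, pair_e, k, a, b):
    """Original criterion: best case of a is worse than worst case of b."""
    best_a, worst_b = self_e[k][a], self_e[k][b]
    for l in range(len(self_e)):
        if l == k:
            continue
        xs = range(len(self_e[l]))
        best_a += min(pair_energy(pair_e, k, a, l, x) for x in xs)
        worst_b += max(pair_energy(pair_e, k, b, l, x) for x in xs)
    return best_a > worst_b


def goldstein_dead_end(self_e, pair_e, k, a, b):
    """Goldstein criterion: a is worse than b against every neighbouring rotamer."""
    diff = self_e[k][a] - self_e[k][b]
    for l in range(len(self_e)):
        if l == k:
            continue
        diff += min(
            pair_energy(pair_e, k, a, l, x) - pair_energy(pair_e, k, b, l, x)
            for x in range(len(self_e[l]))
        )
    return diff > 0


# Toy data (made up): both rotamers at position 0 interact identically with
# position 1, but rotamer 0 has the higher self-energy.
self_e = [[1.0, 0.0], [0.0, 0.0]]
pair_e = {(0, 1): [[0.0, 10.0], [0.0, 10.0]]}
```

Here the wide spread of pair energies defeats the original criterion, whereas taking the minimum of the ''difference'' shows that rotamer 0 is never better than rotamer 1.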
 
An extended discussion of elaborate DEE criteria and a benchmark of their relative performance can be found in {{ref|Peirce}}.
 
==References==
 
# {{note|Desmet}} Desmet J, de Maeyer M, Hazes B, Lasters I. (1992). The dead-end elimination theorem and its use in protein side-chain positioning. ''Nature'' 356:539-542.
# {{note|Voigt}} Voigt CA, Gordon DB, Mayo SL. (2000). Trading accuracy for speed: A quantitative comparison of search algorithms in protein sequence design. ''J Mol Biol'' 299(3):789-803.
# {{note|Dahiyat}} Dahiyat BI, Mayo SL. (1997). De novo protein design: fully automated sequence selection. ''Science'' 278(5335):82-87.
# {{note|Goldstein}} Goldstein RF. (1994). Efficient rotamer elimination applied to protein side-chains and related spin glasses. ''Biophys J'' 66(5):1335-1340.
# {{note|Peirce}} Pierce NA, Spriet JA, Desmet J, Mayo SL. (2000). Conformational splitting: a more powerful criterion for dead-end elimination. ''J Comput Chem'' 21:999-1009.
 
[[Category:Mathematical optimization]]
[[Category:Protein methods]]

