The premise behind all efforts to determine a large number of diverse protein structures is that the total number of protein domain folds is smaller, by many orders of magnitude, than the total number of sequences; in other words, many sequences adopt essentially the same fold. If the fold of a protein could be recognized from sequence information alone, then a complete database of all possible folds would allow the structure corresponding to any sequence to be modeled. The growth of structure determination has turned most biochemists and biologists into consumers of structural information. As the demand for such information continues to outstrip the supply, all aspects of structure modeling assume increasing importance. This unit provides an introduction to modeling a protein's structure from its sequence and surveys the currently available methods described in the subsequent units of this chapter.