On the origin and highly likely completeness of single-domain protein structures.

The size and origin of the protein fold universe are of fundamental and practical importance. An analysis of randomly generated, compact, sticky homopolypeptide conformations constructed in generic simplified and all-atom protein models shows that all have similar folds in the library of solved structures, the Protein Data Bank, and conversely, that all compact, single-domain protein structures in the Protein Data Bank have structural analogues in the compact model set. Thus, both sets are very likely complete, with the protein fold universe arising from compact conformations of hydrogen-bonded secondary structures. Because side chains are represented by their Cbeta atoms, these results also suggest that the observed protein folds are insensitive to the details of side-chain packing. Sequence specificity enters both in fine-tuning the structure and in thermodynamically stabilizing a given fold with respect to the set of alternatives. When the models are scanned against a three-dimensional active-site library, close geometric matches are frequently found. Thus, the presence of active-site-like geometries also seems to be a consequence of the packing of compact secondary structural elements. These results have significant implications for the evolution of protein structure and function.
