Protein folds, functions and evolution.

The evolution of proteins and their functions is reviewed from a structural perspective in the light of the current database. Protein domain families are distributed unequally among the three major classes, the 32 different architectures and almost 700 folds observed to date. We find that the number of new topologies is still increasing, although 25 new structures are now determined for each new topology. The corresponding analysis and classification of function is only just beginning, fuelled by the genome data. The structural data reveal unexpected conservation and divergence of function both within and between families. The next five years will see the compilation of a definitive dictionary of protein families and their related functions, based on structural data that reveal relationships hidden at the sequence level. Such information will provide the foundation for a better understanding of the molecular basis of biological complexity and, it is hoped, will facilitate rational molecular design.
