Catching a common fold

As the first proteins were sequenced in the 1950s, it became evident that they belonged to families. The determination of protein three-dimensional structures during the late 1960s and early 1970s (e.g., insulins, globins, and serine proteinases) confirmed that related proteins from different species adopt similar tertiary structures characteristic of each family. The sequence variations within a family reflected the restraints of the tertiary structures: apart from the catalytic or binding residues, invariant amino acids were most often in the protein core, inaccessible to solvent and with a key role in the protein architecture. The fascination with families of proteins deepened with the realization that many proteins, with quite unrelated sequences, could adopt a common fold. Rossmann, Matthews, Branden, Richardson, and many others recognized similarities between the tertiary structures or domains that occur in many quite different proteins (Richardson, 1981); these included αβ-nucleotide binding motifs (Rossmann fold), αβ-barrels (TIM barrel), β-jelly rolls, four α-helix bundles, and immunoglobulin domains (β-Ig fold). These protein topologies underlined the fact that tertiary structures could be considered as simple combinations of secondary structural elements packed together in a limited number of ways: αβαβαβ, αααα, ββββ, and so on. It seemed that protein structures could be predicted from sequences by combinatorial assembly of the basic elements of secondary structure, following various rules about the handedness of the loops connecting them and the avoidance of strands that were “cross-overs.” However, such combinatorial approaches to the protein folding problem depend on correct assignment of α-helices, β-strands, and coils, and this remains a formidable challenge.

[1] S. Wodak, et al. Modelling the polypeptide backbone with 'spare parts' from known protein structures, 1989, Protein engineering.

[2] G. Barton, et al. Multiple protein sequence alignment from tertiary structure comparison: Assignment of global and residue confidence levels, 1992, Proteins.

[3] D. T. Jones, et al. A new approach to protein fold recognition, 1992, Nature.

[4] G. Crippen, et al. Contact potential that recognizes the correct folding of globular proteins, 1992, Journal of molecular biology.

[5] W. R. Taylor, et al. Protein structure alignment, 1989, Journal of molecular biology.

[6] M. O. Dayhoff, et al. Establishing homologies in protein sequences, 1983, Methods in enzymology.

[7] W. Taylor, et al. Identification of protein sequence homology by consensus template alignment, 1986, Journal of molecular biology.

[8] P. Argos, et al. A sensitive procedure to compare amino acid sequences, 1987, Journal of molecular biology.

[9] G. J. Barton, et al. Evaluation and improvements in the automatic alignment of protein sequences, 1987, Protein engineering.

[10] A. D. McLachlan, et al. Secondary structure‐based profiles: Use of structure‐conserving scoring tables in searching protein sequence databases for structural similarities, 1991, Proteins.

[11] John P. Overington, et al. Alignment and searching for common protein folds using a data bank of structural templates, 1993, Journal of molecular biology.

[12] M. Sippl. Calculation of conformational ensembles from potentials of mean force. An approach to the knowledge-based prediction of local structures in globular proteins, 1990, Journal of molecular biology.

[13] P. Argos, et al. A data bank merging related protein structures and sequences, 1992, Protein engineering.

[14] M. G. Rossmann, et al. Comparison of super-secondary structures in proteins, 1973, Journal of molecular biology.

[15] M. Sternberg, et al. Flexible protein sequence patterns. A sensitive method to detect weak structural similarities, 1990, Journal of molecular biology.

[16] T. Blundell, et al. Knowledge based modelling of homologous proteins, Part I: Three-dimensional frameworks derived from the simultaneous superposition of multiple structures, 1987, Protein engineering.

[17] R. Doolittle. Similar amino acid sequences: chance or common ancestry?, 1981, Science.

[18] J. Ponder, et al. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes, 1987, Journal of molecular biology.

[19] F. Richards, et al. Identification of structural motifs from protein coordinate data: Secondary structure and first‐level supersecondary structure, 1988, Proteins.

[20] T. L. Blundell, et al. A variable gap penalty function and feature weights for protein 3-D structure comparisons, 1992, Protein engineering.

[21] W. R. Taylor, et al. A holistic approach to protein structure alignment, 1989, Protein engineering.

[22] M. G. Rossmann, et al. The evolution of dehydrogenases and kinases, 1975, CRC critical reviews in biochemistry.

[23] M. Karplus, et al. Analysis of side-chain orientations in homologous proteins, 1987, Journal of molecular biology.

[24] John P. Overington, et al. Tertiary structural constraints on protein evolutionary diversity: templates, key residues and structure prediction, 1990, Proceedings of the Royal Society of London. Series B: Biological Sciences.

[25] M. Sternberg, et al. A strategy for the rapid multiple alignment of protein sequences. Confidence levels from tertiary structure comparisons, 1987, Journal of molecular biology.

[26] T. Blundell, et al. Structure of porphobilinogen deaminase reveals a flexible multidomain polymerase with a single catalytic site, 1992, Nature.

[27] John P. Overington, et al. From comparisons of protein sequences and structures to protein modelling and design, 1990, Trends in biochemical sciences.

[28] T. L. Blundell, et al. Knowledge-based prediction of protein structures and the design of novel molecules, 1987, Nature.

[29] C. Chothia. One thousand families for the molecular biologist, 1992, Nature.

[30] C. Sander, et al. Detection of common three‐dimensional substructures in proteins, 1991, Proteins.

[31] C. Sander, et al. Database of homology‐derived protein structures and the structural meaning of sequence alignment, 1991, Proteins.

[32] John P. Overington, et al. Fragment ranking in modelling of protein structure. Conformationally constrained environmental amino acid substitution tables, 1993, Journal of molecular biology.

[33] W. Turnell, et al. Relaxin has conformational homology with insulin, 1977, Nature.

[34] C. Eigenbrot, et al. X-ray structure of human relaxin at 1.5 Å. Comparison to insulin and implications for receptor binding determinants, 1991, Journal of molecular biology.

[35] M. Levitt, et al. Alignment of the amino acid sequences of distantly related proteins using variable gap penalties, 1986, Protein engineering.

[36] A. Lesk, et al. The relation between the divergence of sequence and structure in proteins, 1986, The EMBO journal.

[37] M. J. Sternberg, et al. Evaluation of the sequence template method for protein structure prediction. Discrimination of the (beta/alpha)8-barrel fold, 1992, Journal of molecular biology.

[38] G. J. Williams, et al. The Protein Data Bank: a computer-based archival file for macromolecular structures, 1977, Journal of molecular biology.

[39] A. D. McLachlan, et al. Profile analysis: detection of distantly related proteins, 1987, Proceedings of the National Academy of Sciences of the United States of America.

[40] M. Murthy, et al. A fast method of comparing protein structures, 1984, FEBS letters.

[41] T. L. Blundell, et al. Comparison of solvent-inaccessible cores of homologous proteins: definitions useful for protein modelling, 1987, Protein engineering.

[42] John P. Overington, et al. Knowledge‐based protein modelling and design, 1988.

[43] R. Doolittle. Molecular evolution: computer analysis of protein and nucleic acid sequences, 1990, Methods in enzymology.

[44] M. G. Rossmann, et al. Comparison of protein structures, 1985, Methods in enzymology.

[45] T. L. Blundell, et al. Knowledge based modelling of homologous proteins, Part II: Rules for the conformations of substituted sidechains, 1987, Protein engineering.

[46] J. Richardson, et al. The anatomy and taxonomy of protein structure, 1981, Advances in protein chemistry.

[47] T. A. Jones, et al. Using known substructures in protein model building and crystallography, 1986, The EMBO journal.

[48] John P. Overington, et al. Environment‐specific amino acid substitution tables: Tertiary templates and prediction of protein folds, 1992, Protein science: a publication of the Protein Society.

[49] T. L. Blundell, et al. Phylogenetic relationships from three-dimensional protein structures, 1990, Methods in enzymology.

[50] S. B. Needleman, et al. A general method applicable to the search for similarities in the amino acid sequence of two proteins, 1970, Journal of molecular biology.

[51] T. Blundell, et al. Definition of general topological equivalence in protein structures. A procedure involving comparison of properties and relationships through simulated annealing and dynamic programming, 1990, Journal of molecular biology.