The recognition of protein structure and function from sequence: adding value to genome data.

The explosion of DNA sequence data from genome projects presents many challenges. For instance, we must extend our current knowledge of protein structure and function so that it can be applied to these new sequences. Deriving rules for the relationships between sequence and structure allows us to recognize a common fold by the use of tertiary templates. New techniques enable us to begin to meet the challenge of rule-based modelling of distantly related proteins. This paper describes an integrated, knowledge-based approach to the prediction of protein structure and function that can maximize the value of sequence information.
