Structural features can be unconserved in proteins with similar folds. An analysis of side-chain to side-chain contacts, secondary structure and accessibility.

Side-chain to side-chain contacts, accessibility, secondary structure and RMS deviation were compared within 607 pairs of proteins having similar three-dimensional (3D) structures. Three types of protein 3D structural similarity were defined: type A having sequence and usually functional similarity; type B having functional, but no sequence similarity; and type C having only 3D structural similarity. Within proteins having little or no sequence similarity (types B and C), structural features frequently had a degree of conservation comparable to that found between dissimilar 3D structures. Despite similar protein folds, as few as 30% of residues within similar protein 3D structures can form a common core. RMS deviations on core Cα atoms can be as high as 3.2 Å. Similar protein structures can have secondary structure identities as low as 41%, which is equivalent to that expected by chance. When three categories of amino acid accessibility are defined (buried, half buried and exposed), some similar protein 3D structures have as few as 30% of positions in the same category, making them indistinguishable from pairs of dissimilar protein structures. Similar structures can also have as few as 12% of common side-chain to side-chain contacts, and virtually no similar energetically favourable side-chain to side-chain interactions. Complementary changes are defined as structurally equivalent pairs of interacting residues in two structures with energetically favourable but different side-chain interactions. For many proteins with similar three-dimensional structures, the proportion of complementary changes is close to that expected by chance, suggesting that many similar structures have fundamentally different stabilising interactions. All of the results suggest that proteins having similar 3D structures can have little in common apart from a scaffold of core secondary structures.
This has profound implications for methods of protein fold detection, since many of the properties assumed to be conserved across similar protein 3D structures (e.g. accessibility and side-chain to side-chain contacts) are often unconserved within weakly similar (i.e. type B and C) protein 3D structures. Little difference was found between type B and C similarities, suggesting that the structure of similar proteins can evolve beyond recognition even when function is conserved. Our findings suggest that it is more general features of protein structure, such as the requirements for burial of hydrophobic residues and exposure of polar residues, rather than specific residue-residue interactions, that determine how well a particular sequence adopts a particular fold. (ABSTRACT TRUNCATED AT 400 WORDS)
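
The percentage measures quoted above can be illustrated with a minimal sketch. It assumes the structurally equivalent positions of two proteins have already been paired up, and that each position carries a relative accessibility value. The cut-offs (5% and 25%), the function names and the use of the union of contacts as denominator are illustrative assumptions; the abstract does not state how these were actually defined.

```python
from typing import Sequence, Set, Tuple

# Hypothetical cut-offs on relative accessibility (percent); assumptions only.
BURIED_MAX = 5.0
HALF_BURIED_MAX = 25.0


def accessibility_category(rel_acc: float) -> str:
    """Assign one of three accessibility categories to a residue."""
    if rel_acc <= BURIED_MAX:
        return "buried"
    if rel_acc <= HALF_BURIED_MAX:
        return "half buried"
    return "exposed"


def percent_same_category(acc_a: Sequence[float], acc_b: Sequence[float]) -> float:
    """Percentage of equivalent positions that fall in the same category."""
    if len(acc_a) != len(acc_b):
        raise ValueError("inputs must cover the same equivalent positions")
    if not acc_a:
        raise ValueError("no equivalent positions supplied")
    same = sum(
        accessibility_category(a) == accessibility_category(b)
        for a, b in zip(acc_a, acc_b)
    )
    return 100.0 * same / len(acc_a)


Contact = Tuple[int, int]  # pair of equivalent-position indices in contact


def percent_common_contacts(contacts_a: Set[Contact], contacts_b: Set[Contact]) -> float:
    """Percentage of contacts shared by both structures.

    The denominator is the union of both contact sets (an assumption).
    """
    union = contacts_a | contacts_b
    if not union:
        raise ValueError("no contacts supplied")
    return 100.0 * len(contacts_a & contacts_b) / len(union)


if __name__ == "__main__":
    acc_a = [2.0, 10.0, 60.0, 3.0]
    acc_b = [4.0, 40.0, 70.0, 30.0]
    print(percent_same_category(acc_a, acc_b))

    contacts_a = {(1, 5), (2, 7), (3, 9)}
    contacts_b = {(1, 5), (4, 8)}
    print(percent_common_contacts(contacts_a, contacts_b))
```

With a three-category scheme of this kind, two unrelated structures will still agree at a sizeable fraction of positions by chance, which is why a value near 30% cannot separate similar from dissimilar pairs.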
