Identification of functionally conserved residues with the use of entropy–variability plots

We introduce sequence entropy–variability plots as a method for analyzing families of protein sequences, and demonstrate the method on three well‐known sequence families: globins, ras‐like proteins, and serine proteases. The location of an aligned residue position in the entropy–variability plot correlates with structural characteristics and with known facts about the roles of individual amino acids in the function of these proteins. The large number of known sequences in these families allowed us to introduce new filtering methods for variability patterns. The results are discussed in terms of a simple evolutionary model for functional proteins. Proteins 2003;52:544–552. © 2003 Wiley‐Liss, Inc.
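
The abstract does not spell out how the two axes of the plot are computed. As a minimal sketch, assume that entropy is the Shannon entropy of the residue frequencies in an alignment column and that variability is the number of distinct residue types observed in that column. The function name, the gap handling, and the toy alignment below are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def column_entropy_variability(alignment, gap_chars="-."):
    """Return one (entropy, variability) pair per aligned column.

    Assumptions (not from the source text):
      * entropy is Shannon entropy in nats over non-gap residues;
      * variability is the count of distinct non-gap residue types;
      * gap characters are ignored when computing frequencies.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    results = []
    for column in zip(*alignment):
        counts = Counter(res.upper() for res in column if res not in gap_chars)
        total = sum(counts.values())
        if total == 0:
            # Column contains only gaps: nothing to measure.
            results.append((0.0, 0))
            continue
        # Shannon entropy: sum over residue types of p * ln(1 / p).
        entropy = sum((n / total) * math.log(total / n) for n in counts.values())
        results.append((entropy, len(counts)))
    return results


# Hypothetical toy alignment of four sequences, five columns.
toy_alignment = ["MKVLA", "MKILG", "MRVLS", "MKVL-"]
for position, (entropy, variability) in enumerate(
    column_entropy_variability(toy_alignment), start=1
):
    print(f"position {position}: entropy={entropy:.3f} variability={variability}")
```

Plotting entropy against variability for every column then gives one point per aligned position. Under these assumptions a fully conserved column sits at the origin, while a column in which many residue types occur with similar frequency sits at high entropy and high variability.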
