Structure is three to ten times more conserved than sequence—A study of structural response in protein cores

Protein structures change during evolution in response to mutations. Here, we analyze the mapping between sequence and structure in a set of structurally aligned protein domains. To avoid artifacts, we restricted our attention to the core components of these structures. We found that, on average, structural change in protein cores increases linearly with evolutionary distance (amino acid substitutions per site). This holds irrespective of the measure of structural change used, whether RMSD or discrete structural descriptors for secondary structure, accessibility, or contacts. This linear response allows us to quantify the claim that structure is more conserved than sequence: using structural alphabets of similar cardinality to the sequence alphabet, structural cores evolve three to ten times slower than sequences. Although the average response is linear, we found a wide variance around it; different domain families varied fivefold in their structural response to evolution. An attempt to analyze this variance among subgroups by structural and functional category revealed only one statistically significant trend. This trend can be explained by the fact that beta‐sheets change faster than alpha‐helices, most likely because they are shorter and because change occurs at the ends of secondary structure elements. Proteins 2009. © 2009 Wiley‐Liss, Inc.
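
As an illustration of how a linear response can be turned into a conservation factor, the following minimal sketch fits a straight line through the origin to structural change versus evolutionary distance and reports the inverse of the slope. The through-origin model, the function name `slope_through_origin`, and the toy numbers are assumptions made for this example only; they are not the data or the fitting procedure of the study.

```python
def slope_through_origin(x, y):
    """Least-squares slope b for the no-intercept model y = b * x."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least one non-zero value")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


# Hypothetical toy values, NOT taken from the paper:
# evolutionary distance in amino acid substitutions per site ...
seq_dist = [0.5, 1.0, 1.5, 2.0]
# ... and structural change in structural-state substitutions per site.
struct_dist = [0.1, 0.2, 0.3, 0.4]

slope = slope_through_origin(seq_dist, struct_dist)

# If both alphabets have similar cardinality, 1 / slope can be read as
# "how many times slower structure changes than sequence" for this toy set.
conservation_ratio = 1.0 / slope
print(f"slope = {slope:.3f}, conservation ratio = {conservation_ratio:.1f}")
```

Under these assumptions, a slope between roughly 0.1 and 0.33 would correspond to the three- to tenfold range quoted above; a family-by-family fit of the same kind is one way the reported fivefold variation between domain families could be expressed.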
