Evolutionary plasticity of protein families: Coupling between sequence and structure variation

In this work we examine how protein structural changes are coupled with sequence variation during the evolution of a family of homologs. A sequence–structure correlation analysis of 81 homologous protein families shows that most of them exhibit a statistically significant linear correlation between measures of sequence and structural similarity. We observed, however, cases in which structural variability cannot be explained mainly by sequence variation, such as protein families with a number of disulfide bonds. To understand whether structures from different families and/or folds evolve in the same manner, we compared the degree of structural change per unit of sequence change (“the evolutionary plasticity of structure”) among the families with a significant linear correlation. Using rigorous statistical procedures, we find that, with a few exceptions, evolutionary plasticity does not differ significantly between protein families. A similar sequence–structure analysis of protein loop regions shows that the evolutionary plasticity of loop regions is greater than that of the protein core. Proteins 2005. © 2005 Wiley‐Liss, Inc.
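
The idea of plasticity as "structural change per unit of sequence change" can be illustrated with a small sketch. It assumes, purely for illustration, that each pair of homologs in a family is summarized by a sequence distance and a structural distance (for example, an RMSD); the abstract does not name the exact measures used. Under that assumption, the slope of a least-squares line through the points plays the role of plasticity, and Pearson's r measures the strength of the linear correlation. The function name `fit_line` and all numbers below are hypothetical, synthetic values, not data from the study.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, pearson_r). In this sketch the slope
    stands in for "evolutionary plasticity": structural change per
    unit of sequence change.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Synthetic, illustrative values only (assumed units: fraction of
# non-identical residues for sequence, RMSD in angstroms for structure).
seq_dist = [0.1, 0.2, 0.3, 0.4, 0.5]
core_rmsd = [0.5, 0.7, 0.9, 1.1, 1.3]
loop_rmsd = [0.8, 1.2, 1.6, 2.0, 2.4]

core_slope, core_intercept, core_r = fit_line(seq_dist, core_rmsd)
loop_slope, loop_intercept, loop_r = fit_line(seq_dist, loop_rmsd)

print(f"core: slope={core_slope:.2f}, r={core_r:.2f}")
print(f"loop: slope={loop_slope:.2f}, r={loop_r:.2f}")
```

In this toy data the loop slope is larger than the core slope, mirroring the qualitative statement that loop regions are more plastic than the core. Deciding whether slopes differ between families would additionally require a statistical test for equality of regression slopes, which this sketch does not attempt.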
