The evolution of protein domain families.

Protein domains are the common currency of protein structure and function. Over 10,000 such protein families have now been collected in the Pfam database. Combining these data with animal gene phylogenies from TreeFam allowed us to investigate the gain and loss of protein domains. Most gains and losses of domains occur at protein termini. We show that the nature of the changes is similar after speciation and after duplication events, but that changes in domain architecture happen at a higher frequency after gene duplication. We suggest that the bias towards protein termini arises largely because insertion or deletion of a domain at most other positions in a protein is likely to disrupt the structure of existing domains. Pfam can also be used to trace the evolution of specific families. For example, the immunoglobulin superfamily can be traced over 500 million years, during which it expanded into one of the largest families in the human genome. This family has its origins in early-diverging animals such as sponges (Porifera), where it is found in cell-surface receptor proteins. We can trace how the structure and sequence of this family diverged during vertebrate evolution into the constant and variable domains found in the antibodies of our immune system, as well as in neural and muscle proteins.
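
The distinction between terminal and internal domain changes can be illustrated with a minimal sketch. It assumes that each protein is reduced to an ordered list of domain names (N-terminus first) and that an ancestral and a derived architecture have already been paired, for example along a branch of a gene tree. The function name `classify_domain_change` and the example architectures are hypothetical and chosen only for illustration; this is not the procedure used in the study.

```python
from typing import List, Tuple


def classify_domain_change(ancestral: List[str], derived: List[str]) -> Tuple[str, str]:
    """Return (kind, position) for the difference between two architectures.

    kind is "none", "gain", "loss" or "substitution"; position is "none",
    "N-terminal", "C-terminal", "internal" or "both termini".
    Hypothetical helper: with repeated domains the position is ambiguous,
    and this sketch resolves ties towards the C-terminus.
    """
    if ancestral == derived:
        return ("none", "none")

    shorter = min(len(ancestral), len(derived))

    # Length of the shared N-terminal run of domains.
    prefix = 0
    while prefix < shorter and ancestral[prefix] == derived[prefix]:
        prefix += 1

    # Length of the shared C-terminal run, not overlapping the prefix.
    suffix = 0
    while suffix < shorter - prefix and ancestral[-1 - suffix] == derived[-1 - suffix]:
        suffix += 1

    # Domains that differ once the shared flanks are removed.
    lost = ancestral[prefix:len(ancestral) - suffix]
    gained = derived[prefix:len(derived) - suffix]

    if lost and gained:
        kind = "substitution"
    elif gained:
        kind = "gain"
    else:
        kind = "loss"

    if prefix == 0 and suffix == 0:
        position = "both termini"
    elif prefix == 0:
        position = "N-terminal"
    elif suffix == 0:
        position = "C-terminal"
    else:
        position = "internal"
    return (kind, position)


if __name__ == "__main__":
    # Illustrative architectures only, not data from the study.
    print(classify_domain_change(["SH3", "SH2", "Pkinase"], ["SH2", "Pkinase"]))
    print(classify_domain_change(["I-set", "fn3"], ["I-set", "fn3", "Pkinase"]))
    print(classify_domain_change(["A", "C"], ["A", "B", "C"]))
```

Counting the returned positions separately for branches that follow speciation and branches that follow duplication would give the kind of comparison the abstract describes, although how events are assigned to branches is left out of this sketch.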
