Determinants of adaptive evolution at the molecular level: the extended complexity hypothesis.

To explain why informational genes (i.e., those involved in transcription, translation, and related processes) are less likely than housekeeping genes to be horizontally transferred, Jain and coworkers proposed the complexity hypothesis. The underlying idea is that informational genes belong to large, complex systems of coevolving genes, so a single gene taken from such an integrated system is unlikely to be transferred horizontally with success. Here, this hypothesis is extended to explain some of the determinants of the mode of evolution of coding sequences. It is proposed that genes belonging to complex systems are less likely to be under adaptive evolution. To evaluate this "extended complexity hypothesis," 2,428 families and protein domains were analyzed. The analysis found that genes whose products are highly connected, located in intracellular components, and involved in complex processes and functions were more conserved and less likely to be under adaptive evolution than other genes. The extended complexity hypothesis suggests that both the mode and the rate of evolution of a protein are influenced by its gene ontology (localization, biological process, and molecular function) and by its connectivity.