The Ortholog Conjecture Is Untestable by the Current Gene Ontology but Is Supported by RNA Sequencing Data

The ortholog conjecture posits that orthologous genes are functionally more similar than paralogous genes. This conjecture is a cornerstone of phylogenomics and is used daily by both computational and experimental biologists in predicting, interpreting, and understanding gene functions. A recent study, however, challenged the ortholog conjecture on the basis of experimentally derived Gene Ontology (GO) annotations and microarray gene expression data in human and mouse. It proposed instead that the functional similarity of homologous genes is determined primarily by the cellular context in which the genes act, which would explain why (within-species) paralogs were observed to be functionally more similar than (between-species) orthologs. Here we show that the GO-based functional similarity between human and mouse orthologs, relative to that between paralogs, has been increasing over the last five years. Further, compared with paralogs, orthologs are less likely to be included in the same study, which causes their functional similarity to be underestimated. A close examination of functional studies of homologs with identical protein sequences reveals experimental biases, annotation errors, and homology-based functional inferences that are labeled in GO as experimental. These problems, together with the temporary nature of the GO-based finding, make the current GO inappropriate for testing the ortholog conjecture. RNA sequencing (RNA-Seq) is known to be superior to microarrays for comparing expression levels among different genes or among different species. Our analysis of a large RNA-Seq dataset of multiple tissues from eight mammals and the chicken shows that the expression similarity between orthologs is significantly higher than that between within-species paralogs, supporting the ortholog conjecture and refuting the cellular context hypothesis for gene expression. We conclude that the ortholog conjecture remains largely valid to the extent that it has been tested, but further scrutiny using more and better functional data is needed.
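
The expression comparison described above can be illustrated with a minimal sketch. It is not the analysis pipeline used in the study: the gene names, the toy expression values, the choice of Pearson correlation across tissues as the similarity measure, and the helper names `pearson` and `mean_pair_similarity` are all assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_pair_similarity(pairs, expression):
    """Average expression similarity over a list of gene pairs."""
    scores = [pearson(expression[a], expression[b]) for a, b in pairs]
    return sum(scores) / len(scores)


# Hypothetical expression levels across four tissues (toy values, not real data).
expression = {
    "human_A": [10.0, 2.0, 5.0, 1.0],
    "mouse_A": [9.0, 3.0, 6.0, 1.5],   # assumed ortholog of human_A
    "human_A2": [1.0, 8.0, 2.0, 7.0],  # assumed within-species paralog of human_A
}

ortholog_pairs = [("human_A", "mouse_A")]
paralog_pairs = [("human_A", "human_A2")]

ortholog_similarity = mean_pair_similarity(ortholog_pairs, expression)
paralog_similarity = mean_pair_similarity(paralog_pairs, expression)

print("mean ortholog similarity:", ortholog_similarity)
print("mean paralog similarity:", paralog_similarity)
```

In a real analysis the two groups of pairs would need to be matched for confounders such as divergence time, and the difference between them would be assessed with a statistical test rather than by comparing two means directly.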
