A Tight Link between Orthologs and Bidirectional Best Hits in Bacterial and Archaeal Genomes

Orthologous relationships between genes are routinely inferred from bidirectional best hits (BBH) in pairwise genome comparisons. However, to our knowledge, it has never been quantitatively demonstrated that orthologs form BBH. To test this “BBH-orthology conjecture,” we take advantage of the operon organization of bacterial and archaeal genomes and assume that, when two genes in compared genomes are flanked by two BBH and show statistically significant sequence similarity to one another, these genes are bona fide orthologs. Under this assumption, we tested whether the middle genes in “syntenic orthologous gene triplets” form BBH. We found that this was the case in more than 95% of the syntenic gene triplets in all genome comparisons. A detailed examination of the exceptions to this pattern, including maximum likelihood phylogenetic tree analysis, showed that some of these deviations involved artifacts of genome annotation, whereas very small fractions represented random assignment of the best hit to one of several closely related in-paralogs, paralogous displacement in situ, or, even less frequently, genuine violations of the BBH-orthology conjecture caused by acceleration of evolution in one of the orthologs. We conclude that, at least in prokaryotes, genes for which independent evidence of orthology is available typically form BBH and, conversely, BBH can serve as a strong indication of gene orthology.
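The triplet test described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy gene identifiers, the bit scores, and the simplification that "statistically significant similarity" is supplied as a precomputed set of gene pairs are all assumptions made for illustration. The sketch also assumes linear gene orders and ignores strand and circular chromosomes.

```python
from typing import Dict, List, Set, Tuple

# (query gene, subject gene) -> similarity score, e.g. a bit score (hypothetical input format)
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query gene to its highest-scoring subject gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> Set[Tuple[str, str]]:
    """Return pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


def syntenic_triplets(
    order_a: List[str],
    order_b: List[str],
    bbh: Set[Tuple[str, str]],
    similar: Set[Tuple[str, str]],
) -> List[Tuple[str, str, bool]]:
    """For each triplet whose flanking genes are BBH and whose middle genes
    are significantly similar, report (middle_a, middle_b, middle_is_bbh)."""
    position_b = {gene: i for i, gene in enumerate(order_b)}
    partner = dict(bbh)  # gene in A -> its BBH partner in B
    results = []
    for i in range(1, len(order_a) - 1):
        left_a, mid_a, right_a = order_a[i - 1 : i + 2]
        if left_a not in partner or right_a not in partner:
            continue  # at least one flank has no BBH partner
        left_pos = position_b.get(partner[left_a])
        right_pos = position_b.get(partner[right_a])
        if left_pos is None or right_pos is None or abs(left_pos - right_pos) != 2:
            continue  # flanks do not enclose exactly one gene in genome B
        mid_b = order_b[(left_pos + right_pos) // 2]
        if (mid_a, mid_b) not in similar:
            continue  # middle genes are not significantly similar
        results.append((mid_a, mid_b, (mid_a, mid_b) in bbh))
    return results


# Toy data: b6 is a hypothetical paralog that outscores the syntenic gene b4 for a4.
order_a = ["a1", "a2", "a3", "a4", "a5"]
order_b = ["b1", "b2", "b3", "b4", "b5", "b6"]
a_vs_b = {("a1", "b1"): 300, ("a2", "b2"): 250, ("a3", "b3"): 200,
          ("a4", "b4"): 120, ("a4", "b6"): 150, ("a5", "b5"): 280}
b_vs_a = {(b, a): s for (a, b), s in a_vs_b.items()}
similar = set(a_vs_b)  # assumption: every scored pair passed the significance threshold

bbh = bidirectional_best_hits(a_vs_b, b_vs_a)
triplets = syntenic_triplets(order_a, order_b, bbh, similar)
fraction_bbh = sum(is_bbh for _, _, is_bbh in triplets) / len(triplets)
```

In this toy case, the triplet centered on a2 supports the conjecture, whereas the triplet centered on a4 does not, because the best hit of a4 is the out-of-context gene b6 rather than its syntenic counterpart b4. The paper reports that such exceptions account for fewer than 5% of triplets in real genome comparisons.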
