Advances and Applications in the Quest for Orthologs

Abstract: Gene families evolve through speciation (creating orthologs), gene duplication (creating paralogs), and horizontal gene transfer (creating xenologs), in addition to sequence divergence and gene loss. Orthologs in particular play an essential role in comparative genomics and phylogenomic analyses. With the continued sequencing of organisms across the tree of life, the data are now available to reconstruct the unique evolutionary histories of tens of thousands of gene families. Accurate reconstruction of these histories, however, is a challenging computational problem, and it is the focus of the Quest for Orthologs Consortium. We review the recent advances and outstanding challenges in this field, as presented at a symposium and meeting held at the University of Southern California in 2017. Key advances have been made both in orthology algorithm development and in coordination across the community of algorithm developers and orthology end-users. Applications spanned a broad range, including gene function prediction, phylostratigraphy, genome evolution, and phylogenomics. The meetings highlighted the increasing use of meta-analyses that integrate results from multiple algorithms, and discussed ongoing challenges in orthology inference as well as the next steps toward improving and integrating orthology resources.
