From event-labeled gene trees to species trees

Background
Tree reconciliation problems have long been studied in phylogenetics. A particular variant of the reconciliation problem for a gene tree T and a species tree S assumes that for each interior vertex x of T it is known whether x represents a speciation or a duplication. This problem appears in the context of analyzing orthology data.

Results
We show that S is a species tree for T if and only if S displays all rooted triples of T that have three distinct species as their leaves and are rooted in a speciation vertex. A valid reconciliation map can then be found in polynomial time. Simulated data show that event-labeled gene trees convey a large amount of information on the underlying species trees, even when a large percentage of genes has been lost.

Conclusions
Knowledge of the event labels in a gene tree strongly constrains the possible species trees and, for a given species tree, also the possible reconciliation maps. Nevertheless, many degrees of freedom remain in the space of feasible solutions. To disambiguate the alternative solutions, additional external constraints as well as optimization criteria could be employed.
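The triple condition stated in the results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tree encodings, the helper names (`leaf`, `speciation_triples`, `displays`, `satisfies_triple_condition`) and the toy gene tree are assumptions made for illustration only. The sketch collects the species triples xy|z that are rooted in a speciation vertex of T and have three distinct species, then tests whether a candidate S displays all of them. It assumes that S is given on the species occurring in T and does not construct a reconciliation map.

```python
from itertools import combinations

# Hypothetical encoding (not from the paper):
#   gene tree leaf:   ("leaf", gene_name, species_name)
#   gene tree inner:  (event, [children]) with event "S" (speciation) or "D" (duplication)
#   species tree:     nested tuples of species names, e.g. (("A", "B"), "C")

def leaf(gene, species):
    return ("leaf", gene, species)

def leaves(node):
    """All leaves below a gene tree vertex."""
    if node[0] == "leaf":
        return [node]
    return [l for child in node[1] for l in leaves(child)]

def speciation_triples(node):
    """Species triples (frozenset({x, y}), z), read xy|z, that are rooted in a
    speciation vertex and involve three distinct species."""
    if node[0] == "leaf":
        return set()
    event, children = node
    triples = set()
    for child in children:
        triples |= speciation_triples(child)
    if event == "S":
        per_child = [leaves(c) for c in children]
        for i, inside in enumerate(per_child):
            for j, outside in enumerate(per_child):
                if i == j:
                    continue
                # a and b share a child subtree, c lies in another one,
                # so the triple ab|c is rooted exactly at this vertex.
                for a, b in combinations(inside, 2):
                    for c in outside:
                        if len({a[2], b[2], c[2]}) == 3:
                            triples.add((frozenset((a[2], b[2])), c[2]))
    return triples

def clades(tree):
    """Leaf sets of all vertices of a species tree; the root's set comes last."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    sub = [clades(child) for child in tree]
    top = frozenset().union(*(c[-1] for c in sub))
    return [cl for c in sub for cl in c] + [top]

def displays(species_tree, triple):
    """True if the species tree displays xy|z: some clade contains x and y but not z."""
    pair, out = triple
    all_clades = clades(species_tree)
    return out in all_clades[-1] and any(
        pair <= cl and out not in cl for cl in all_clades
    )

def satisfies_triple_condition(species_tree, gene_tree):
    return all(displays(species_tree, t) for t in speciation_triples(gene_tree))

# Toy gene tree: a duplication at the root, speciations below it.
gene_tree = ("D", [
    ("S", [("S", [leaf("a1", "A"), leaf("b1", "B")]), leaf("c1", "C")]),
    leaf("a2", "A"),
])

print(speciation_triples(gene_tree))
print(satisfies_triple_condition((("A", "B"), "C"), gene_tree))
print(satisfies_triple_condition((("A", "C"), "B"), gene_tree))
```

In this toy example the duplication at the root contributes no triples, so the only constraint on the species tree is AB|C, which comes from the inner speciation vertex. The candidate (("A", "B"), "C") meets the condition and (("A", "C"), "B") does not.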
