Species Trees from Gene Trees Despite a High Rate of Lateral Genetic Transfer: A Tight Bound (Extended Abstract)

Reconstructing the tree of life from molecular sequences is a fundamental problem in computational biology. Modern data sets often contain a large number of genes, which can complicate reconstruction because different genes may undergo different evolutionary histories. This is the case in particular in the presence of lateral genetic transfer (LGT), whereby a gene is inherited from a distant species rather than from an immediate ancestor. Such an event produces a gene tree that is distinct from (but related to) the species phylogeny. In previous work, a stochastic model of LGT was introduced, and it was shown that the species phylogeny can be reconstructed from gene trees despite surprisingly high rates of LGT. Both lower and upper bounds on this rate were obtained, but a large gap remained. Here we close this gap, up to a constant. Specifically, we show that the species phylogeny can be reconstructed perfectly even when each edge of the tree has a constant probability of being the location of an LGT event. Our new reconstruction algorithm builds the tree recursively from the leaves. We also provide a matching bound in the negative direction (up to a constant).
