Inferring duplications, losses, transfers and incomplete lineage sorting with nonbinary species trees

Motivation: Gene duplication (D), transfer (T), loss (L) and incomplete lineage sorting (I) are crucial to the evolution of gene families and the emergence of novel functions. The history of these events can be inferred by comparing gene and species trees, a process called reconciliation, yet current reconciliation algorithms model only a subset of these evolutionary processes.

Results: We present an algorithm to reconcile a binary gene tree with a nonbinary species tree under a DTLI parsimony criterion. This is the first reconciliation algorithm to capture all four evolutionary processes driving tree incongruence and the first to reconcile nonbinary species trees with a transfer model. Our algorithm infers all optimal solutions and reports complete, temporally feasible event histories, giving the gene and species lineages in which each event occurred. It is fixed-parameter tractable, running in polynomial time when the maximum species outdegree is fixed. Application of our algorithms to prokaryotic and eukaryotic data shows that using an incomplete event model has a substantial impact on the events inferred and on the resulting biological conclusions.

Availability: Our algorithms have been implemented in Notung, a freely available phylogenetic reconciliation software package, available at http://www.cs.cmu.edu/~durand/Notung.

Contact: mstolzer@andrew.cmu.edu
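To make the notion of reconciliation concrete, the following is a minimal sketch of the classic lowest-common-ancestor (LCA) mapping, which infers duplications only and assumes a binary species tree. It is not the DTLI algorithm described above and does not model transfers, incomplete lineage sorting, or nonbinary species nodes. The toy species tree, the nested-tuple gene tree encoding, and all function names are illustrative assumptions.

```python
# Hypothetical toy species tree ((A,B),C), stored as a child -> parent map.
SPECIES_PARENT = {"A": "AB", "B": "AB", "AB": "ABC", "C": "ABC", "ABC": None}


def ancestors(node, parent):
    """Return the path from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def lca(a, b, parent):
    """Lowest common ancestor of species nodes `a` and `b`."""
    seen = set(ancestors(a, parent))
    for node in ancestors(b, parent):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def reconcile(gene_tree, parent, duplications):
    """Map a binary gene tree into the species tree by LCA mapping.

    A gene tree leaf is the name of the species it was sampled from;
    an internal node is a (left, right) tuple. The function returns the
    species node that `gene_tree` maps to and appends to `duplications`
    the species node of every gene node inferred to be a duplication.
    """
    if isinstance(gene_tree, str):
        return gene_tree
    left, right = gene_tree
    m_left = reconcile(left, parent, duplications)
    m_right = reconcile(right, parent, duplications)
    m_here = lca(m_left, m_right, parent)
    # A gene node that maps to the same species node as one of its
    # children cannot be a speciation, so it is called a duplication.
    if m_here in (m_left, m_right):
        duplications.append(m_here)
    return m_here


# A gene tree that disagrees with the species tree: ((A,B),(A,C)).
dups = []
root_species = reconcile((("A", "B"), ("A", "C")), SPECIES_PARENT, dups)
print(root_species, dups)
```

In this toy case the gene tree root and its right child both map to the species root, so the root is inferred to be a duplication. A gene tree with the same topology as the species tree, such as `(("A", "B"), "C")`, yields no duplications. Under a duplication-only model every incongruence is explained this way, which illustrates why an incomplete event model can change the events inferred.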
