Reconciliation With Nonbinary Gene Trees Revisited

By reconciling the phylogenetic tree of a gene family with the corresponding species tree, it is possible to infer lineage-specific duplications and losses with high confidence and hence to annotate orthologs and paralogs. The currently available reconciliation methods for nonbinary gene trees are computationally expensive for genome-scale applications. We present four O(|G|+|S|) algorithms to reconcile an arbitrary gene tree G with a binary species tree S in the duplication, loss, duploss (also known as mutation), and deep coalescence cost models, where |·| denotes the number of nodes in a tree. The improvement is achieved through two innovations: a linear-time computation of compressed child-image subtrees and an efficient reconstruction of irreducible duplication histories. Our technique for child-image subtree compression also yields an order-of-magnitude speedup in runtime for the dynamic programming and Wagner parsimony-based methods for tree reconciliation in the affine cost model.
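To make the notion of reconciliation concrete, the following is a minimal sketch of the classical LCA mapping under the duplication cost model, restricted to binary gene trees. It is not the linear-time algorithm for nonbinary gene trees presented here: the LCA is found by naive parent-walking, and the tree encodings (nested tuples for G, a child-to-parent dictionary for S) and the names `node_depths`, `lca`, and `count_duplications` are assumptions made for illustration only.

```python
from typing import Dict, Optional, Tuple, Union

# Hypothetical encodings: a gene tree is either a species name (a leaf)
# or a pair of gene subtrees; the species tree is a child -> parent map
# in which the root maps to None.
GeneTree = Union[str, Tuple["GeneTree", "GeneTree"]]
SpeciesParent = Dict[str, Optional[str]]


def node_depths(parent: SpeciesParent) -> Dict[str, int]:
    """Depth of every species node, with the root at depth 0."""
    depth: Dict[str, int] = {}

    def d(v: str) -> int:
        if v not in depth:
            p = parent[v]
            depth[v] = 0 if p is None else d(p) + 1
        return depth[v]

    for v in parent:
        d(v)
    return depth


def lca(u: str, v: str, parent: SpeciesParent, depth: Dict[str, int]) -> str:
    """Naive lowest common ancestor: repeatedly lift the deeper node."""
    while u != v:
        if depth[u] < depth[v]:
            u, v = v, u
        u = parent[u]  # u is at least as deep as v and is not the root
    return u


def count_duplications(gene_tree: GeneTree, parent: SpeciesParent) -> int:
    """Count duplication nodes of a binary gene tree under the LCA mapping."""
    depth = node_depths(parent)
    dups = 0

    def image(g: GeneTree) -> str:
        nonlocal dups
        if isinstance(g, str):
            return g  # a gene leaf maps to the species it was sampled from
        left, right = image(g[0]), image(g[1])
        here = lca(left, right, parent, depth)
        # A node whose image equals a child's image is a duplication.
        if here in (left, right):
            dups += 1
        return here

    image(gene_tree)
    return dups


# Species tree ((A,B),C) and gene tree ((A,B),(A,C)).
species_parent = {"A": "AB", "B": "AB", "AB": "root", "C": "root", "root": None}
gene_tree = (("A", "B"), ("A", "C"))
n_dups = count_duplications(gene_tree, species_parent)
```

In this example the gene-tree root and its right child both map to the species root, so the root is counted as a duplication. The naive `lca` costs time proportional to the depth of S per query, which is the kind of overhead that constant-time LCA preprocessing removes.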
