Maximum Parsimony for Tree Mixtures

With the number of sequenced genomes growing ever larger, it is now common practice to concatenate sequence alignments from several genomic loci as a first step in phylogenetic tree inference. However, different loci may support different trees due to processes such as gene duplication and lineage sorting, so it is important to better understand how commonly used phylogenetic inference methods behave on such "phylogenetic mixtures". Here we focus on how parsimony, one of the most popular methods for reconstructing phylogenetic trees, behaves for mixtures of two trees. In particular, we (i) show that the parsimony problem is NP-complete for mixtures of two trees, (ii) show that there are mixtures of two trees that have exponentially many (in the number of leaves) most parsimonious trees, and (iii) give an explicit description of the most parsimonious tree(s) and scores corresponding to the mixture of a pair of trees related by a single TBR operation.
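To make the notion of scoring a tree against a mixture concrete, the following is a minimal sketch, not the paper's method: it concatenates two toy alignments and computes the parsimony score of each candidate four-taxon tree with Fitch's algorithm. The function names (`fitch_site`, `parsimony_score`, `concatenate`), the nested-tuple tree encoding, and the binary toy data are all assumptions made for illustration. Trees are written as rooted binary tuples; for parsimony the score does not depend on where the root is placed.

```python
def fitch_site(tree, states):
    """Fitch's algorithm for one site.

    tree   : a leaf name (str) or a 2-tuple of subtrees (hypothetical encoding)
    states : dict mapping leaf name -> character at this site
    Returns (candidate state set at the subtree root, number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    lset, lcost = fitch_site(left, states)
    rset, rcost = fitch_site(right, states)
    common = lset & rset
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, lcost + rcost
    # Children disagree: take the union and count one change.
    return lset | rset, lcost + rcost + 1


def parsimony_score(tree, alignment):
    """Sum of per-site Fitch scores; alignment maps leaf name -> sequence."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {leaf: seq[i] for leaf, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


def concatenate(*alignments):
    """Concatenate per-locus alignments over the same leaf set."""
    leaves = alignments[0].keys()
    return {leaf: "".join(a[leaf] for a in alignments) for leaf in leaves}


# Toy data (invented): locus 1 has two sites supporting ab|cd,
# locus 2 has one site supporting ac|bd.
locus1 = {"a": "00", "b": "00", "c": "11", "d": "11"}
locus2 = {"a": "0", "b": "1", "c": "0", "d": "1"}
mixture = concatenate(locus1, locus2)

# The three possible topologies on four taxa.
candidates = {
    "ab|cd": (("a", "b"), ("c", "d")),
    "ac|bd": (("a", "c"), ("b", "d")),
    "ad|bc": (("a", "d"), ("b", "c")),
}
scores = {name: parsimony_score(t, mixture) for name, t in candidates.items()}
best = min(scores, key=scores.get)
```

On this toy mixture the scores are 4, 5 and 6 for `ab|cd`, `ac|bd` and `ad|bc` respectively, so `ab|cd` is the single most parsimonious tree. Enumerating all topologies like this is only feasible for very small leaf sets, which is consistent with the hardness result stated above.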
