Polynomial Supertree Methods Revisited

Supertree methods reconstruct large phylogenetic trees by combining smaller trees with overlapping leaf sets into a single, more comprehensive supertree. The most commonly used supertree method, matrix representation with parsimony (MRP), produces accurate supertrees but is rather slow because the underlying optimization problem is computationally hard. In this paper, we present an extensive simulation study comparing the performance of MRP with that of the polynomial-time supertree methods MinCut Supertree, Modified MinCut Supertree, Build-with-distances, PhySIC, PhySIC_IST, and super distance matrix (SDM). We consider both the quality and the resolution of the reconstructed supertrees. Our findings illustrate the tradeoff between accuracy and running time in supertree construction, as well as the pros and cons of voting-based and veto-based supertree approaches. Based on our results, we make some general suggestions for future supertree methods.
