Assessment of the accuracy of matrix representation with parsimony analysis supertree construction.

Despite the growing popularity of supertree construction for combining phylogenetic information to produce more inclusive phylogenies, large-scale performance testing of this method has not been done. Through simulation, we tested the accuracy of the most widely used supertree method, matrix representation with parsimony analysis (MRP), with respect to a (maximum parsimony) total evidence solution and a known model tree. When source trees overlapped completely, MRP provided a reasonable approximation of the total evidence tree; agreement was usually greater than 85%. Performance improved slightly when using smaller, more numerous, or more congruent source trees, and especially when elements were weighted in proportion to the bootstrap frequencies of the nodes they represented on each source tree ("weighted MRP"). Although total evidence always estimated the model tree slightly better than nonweighted MRP methods, weighted MRP in turn usually outperformed total evidence slightly. When source studies were even moderately nonoverlapping (i.e., sharing only three-quarters of the taxa), the high proportion of missing data caused a loss in resolution that severely degraded the performance of all methods, including total evidence. In such cases, even combining more trees, which had positive effects elsewhere, did not improve accuracy. Instead, "seeding" the supertree or total evidence analyses with a single largely complete study improved performance substantially. Seeding could therefore be an important strategy for any study that seeks to combine phylogenetic information. Overall, our results suggest that MRP supertree construction provides a reasonable approximation of a total evidence solution and that weighted MRP should be used whenever possible.
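
For readers unfamiliar with the matrix representation step, the following is a minimal sketch of the standard binary coding that MRP is built on; it is not the simulation code used in this study. Each internal node of a source tree becomes one matrix element: taxa descended from that node are scored 1, other taxa in the same source tree are scored 0, and taxa absent from that source tree are scored as missing (?). The nested-tuple tree encoding, the function names (leaves, clades, mrp_matrix), and the taxon labels are assumptions made for illustration. The sketch omits the all-zero outgroup that is normally added to root the analysis, the element weights used in weighted MRP, and the parsimony search itself.

```python
def leaves(tree):
    """Return the set of taxon names under a nested-tuple tree (hypothetical encoding)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Yield the taxon set of every internal node below the root."""
    if isinstance(tree, str):
        return
    for child in tree:
        if not isinstance(child, str):
            yield leaves(child)
            yield from clades(child)


def mrp_matrix(source_trees):
    """Code source trees as one binary matrix with '?' for missing taxa.

    Returns a dict mapping each taxon to its string of element states.
    """
    all_taxa = sorted(set().union(*(leaves(t) for t in source_trees)))
    rows = {taxon: [] for taxon in all_taxa}
    for tree in source_trees:
        present = leaves(tree)
        for clade in clades(tree):          # one matrix element per node
            for taxon in all_taxa:
                if taxon not in present:
                    rows[taxon].append("?")  # taxon not sampled in this tree
                elif taxon in clade:
                    rows[taxon].append("1")  # descended from this node
                else:
                    rows[taxon].append("0")  # in the tree, outside the node
    return {taxon: "".join(states) for taxon, states in rows.items()}


# Two small, partially overlapping source trees (illustrative only).
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), "E")

for taxon, states in mrp_matrix([tree_1, tree_2]).items():
    print(taxon, states)
```

In this toy case taxon E is missing from the first tree and taxa B and D from the second, so their rows contain ? entries. This is the kind of missing data that accumulates when source studies overlap only partially.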
