PhySIC: a veto supertree method with desirable properties.

This paper focuses on veto supertree methods, i.e., methods that aim to produce a conservative synthesis of the relationships agreed upon by all source trees. We propose desirable properties that a supertree should satisfy in this framework, namely the non-contradiction property (PC) and the induction property (PI). The former requires that the supertree contain no relationship that contradicts one source topology or a combination of source topologies, whereas the latter requires that all topological information contained in the supertree be present in a source tree or collectively induced by several source trees. We provide simple examples that illustrate the relevance of these properties and allow a comparison with previously advocated properties. We show that both properties can be checked in polynomial time for any given rooted supertree. Moreover, we introduce the PhySIC method (PHYlogenetic Signal with Induction and non-Contradiction). For k input trees spanning a set of n taxa, this method produces a supertree that satisfies the above-mentioned properties in O(kn^3 + n^4) computing time. The polytomies of the resulting supertree are also tagged with labels indicating areas of conflict as well as areas with insufficient overlap. As a whole, PhySIC enables the user to quickly summarize the consensual information in a set of trees and to localize groups of taxa for which the data require consolidation. Lastly, we illustrate the behaviour of PhySIC on primate data sets of various sizes, and propose a supertree covering 95% of all extant primate genera. The PhySIC algorithm is available at http://atgc.lirmm.fr/cgi-bin/PhySIC.
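
To make the non-contradiction idea concrete, the following minimal sketch checks only the simplest case: whether a candidate rooted supertree resolves three taxa differently from a single source tree. It assumes that topological information is encoded as rooted triplets ab|c and that trees are written as nested Python tuples; both choices, and all function names, are illustrative assumptions rather than details taken from the abstract. It does not test contradictions arising from combinations of source trees, does not test PI, and is not the PhySIC algorithm.

```python
from itertools import combinations


def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def triplets(tree):
    """Return the rooted triplets ab|c displayed by the tree.

    Each triplet is stored as (frozenset({a, b}), c): a and b share a
    common ancestor that is not an ancestor of c.
    """
    found = set()
    if isinstance(tree, str):
        return found
    child_leaves = [leaves(child) for child in tree]
    for i, child in enumerate(tree):
        found |= triplets(child)
        # Leaves below this node that are not in the i-th child.
        outside = set()
        for j, other in enumerate(child_leaves):
            if j != i:
                outside |= other
        # Any two leaves inside one child are closer to each other
        # than to a leaf from a sibling subtree.
        for a, b in combinations(sorted(child_leaves[i]), 2):
            for c in outside:
                found.add((frozenset({a, b}), c))
    return found


def direct_contradictions(supertree, sources):
    """Return pairs (supertree triplet, source triplet) that resolve the
    same three taxa differently. Hypothetical helper: direct conflicts only."""
    by_taxa = {}
    for source in sources:
        for pair, out in triplets(source):
            by_taxa.setdefault(pair | {out}, set()).add((pair, out))
    conflicts = set()
    for pair, out in triplets(supertree):
        for other in by_taxa.get(pair | {out}, ()):
            if other != (pair, out):
                conflicts.add(((pair, out), other))
    return conflicts
```

For instance, with the source trees `(("a", "b"), "c")` and `(("b", "c"), "d")`, the candidate `(("a", "b"), "c", "d")` yields no direct conflict, whereas `((("a", "c"), "b"), "d")` conflicts with the first source tree on the taxa a, b, c.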
