Imputing Supertrees and Supernetworks from Quartets

Inferring species phylogenies is an important part of understanding molecular evolution. It is well known, however, that an accurate phylogenetic tree reconstructed for a single gene does not necessarily correspond to the species phylogeny. One commonly accepted strategy for coping with this problem is to sequence many genes; how best to analyze the resulting collection of genes is more contentious. Supermatrix and supertree methods can be used, although these can suppress conflicts arising from true differences among the gene trees caused by processes such as lineage sorting, horizontal gene transfer, or gene duplication and loss. In 2004, Huson et al. (IEEE/ACM Trans. Comput. Biol. Bioinformatics 1:151-158) presented the Z-closure method, which can circumvent this problem by generating a supernetwork rather than a supertree. Here we present an alternative way of generating supernetworks, called Q-imputation. In particular, we describe a method that uses quartet information to add missing taxa to gene trees. The resulting trees are then used to generate consensus networks, which generalize strict and majority-rule consensus trees. Through simulations and applications to real data sets, we compare Q-imputation to the matrix representation with parsimony (MRP) supertree method and to Z-closure, and demonstrate that it provides a useful complementary tool.
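
To make the final step concrete, the following is a minimal Python sketch of how the splits of a consensus network might be selected from a collection of trees on a common taxon set, assuming the usual threshold formulation in which a split is kept if it occurs in more than a given proportion of the trees. It is an illustration only, not the implementation used in this work: the nested-tuple tree encoding and the function names `leaves`, `splits`, and `consensus_splits` are hypothetical, and the imputation step that adds missing taxa is not shown.

```python
from collections import Counter


def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the non-trivial splits of a tree.

    Each split is stored as the side that does NOT contain the
    alphabetically smallest taxon, so equal splits compare equal.
    """
    taxa = leaves(tree)
    anchor = min(taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        cluster = frozenset().union(*(visit(child) for child in node))
        other = taxa - cluster
        # Keep only splits with at least two taxa on each side.
        if len(cluster) >= 2 and len(other) >= 2:
            found.add(other if anchor in cluster else cluster)
        return cluster

    visit(tree)
    return found


def consensus_splits(trees, threshold):
    """Map each split occurring in more than `threshold` of the trees to its frequency."""
    counts = Counter(s for t in trees for s in splits(t))
    n = len(trees)
    return {s: c / n for s, c in counts.items() if c / n > threshold}


# Hypothetical toy input: three gene trees on the taxa a..e (assumed complete,
# i.e. any missing taxa have already been added).
trees = [
    ((("a", "b"), "c"), ("d", "e")),
    ((("a", "b"), "c"), ("d", "e")),
    ((("a", "c"), "b"), ("d", "e")),
]

majority = consensus_splits(trees, 0.5)  # majority-rule splits
network = consensus_splits(trees, 0.2)   # lower threshold, may keep conflicts

for split, freq in sorted(network.items(), key=lambda kv: -kv[1]):
    print(sorted(split), round(freq, 2))
```

With a threshold of 0.5 the retained splits are pairwise compatible and correspond to the majority-rule consensus tree. In this toy input, lowering the threshold to 0.2 also retains the minority split that groups a with c, which conflicts with the split grouping a with b, so the result can only be drawn as a network rather than a tree.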
