Supertree bootstrapping methods for assessing phylogenetic variation among genes in genome-scale data sets.

Nonparametric bootstrapping methods may be useful for assessing confidence in a supertree inference. We examined the performance of two supertree bootstrapping methods on four published data sets that each include sequence data from more than 100 genes. In "input tree bootstrapping," input gene trees are sampled with replacement and then combined in replicate supertree analyses; in "stratified bootstrapping," trees from each gene's separate (conventional) bootstrap tree set are sampled randomly with replacement and then combined. Generally, support values from both supertree bootstrap methods were similar to or slightly lower than the corresponding bootstrap values from a total evidence, or supermatrix, analysis. Yet supertree bootstrap support also exceeded supermatrix bootstrap support for a number of clades. There was little overall difference in support scores between the input tree and stratified bootstrapping methods. Results from supertree bootstrapping methods, when compared to results from corresponding supermatrix bootstrapping, may provide insights into patterns of variation among genes in genome-scale data sets.
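
The two resampling schemes described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the representation of trees as Newick strings, and the reading of stratified bootstrapping as "one tree drawn per gene per replicate" are all assumptions. The supertree construction step applied to each replicate is not shown.

```python
import random


def input_tree_bootstrap(gene_trees, n_replicates, rng):
    """Resample the input gene trees with replacement.

    Each replicate holds as many trees as the original input set, so a
    given gene tree may appear several times or not at all.
    """
    k = len(gene_trees)
    return [[rng.choice(gene_trees) for _ in range(k)]
            for _ in range(n_replicates)]


def stratified_bootstrap(bootstrap_sets, n_replicates, rng):
    """Draw one tree at random from every gene's own bootstrap tree set.

    Assumption: each gene contributes exactly one tree to every replicate,
    so every gene is always represented.
    """
    return [[rng.choice(trees) for trees in bootstrap_sets.values()]
            for _ in range(n_replicates)]


# Hypothetical toy data: trees are placeholder Newick strings.
rng = random.Random(0)
gene_trees = ["((A,B),C);", "((A,C),B);", "((B,C),A);"]
bootstrap_sets = {
    "gene1": ["((A,B),C);", "((A,C),B);"],
    "gene2": ["((B,C),A);", "((A,B),C);"],
}

input_reps = input_tree_bootstrap(gene_trees, 5, rng)
strat_reps = stratified_bootstrap(bootstrap_sets, 4, rng)
```

Each list in `input_reps` or `strat_reps` would then be passed to a supertree method, and clade support would be tallied as the fraction of replicate supertrees that contain the clade.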
