Visualizing Phylogenetic Treespace Using Cartographic Projections

Phylogenetic analysis is an increasingly important tool in biological research, with applications that include epidemiological studies, drug development, and evolutionary analysis. Phylogenetic search is known to be NP-hard. The size of the datasets that can be analyzed is limited by the faster-than-exponential growth in the number of trees that must be considered as the problem size increases. A better understanding of the problem space could lead to better search methods, which in turn could make the analysis of larger datasets feasible. We present a definition of phylogenetic tree space and a visualization of this space that shows significant exploitable structure. This structure can be used to develop search methods capable of handling much larger datasets.
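To give a sense of the growth in the number of trees mentioned above, the following minimal sketch counts unrooted, fully resolved (binary) tree topologies on n labeled taxa using the standard double-factorial formula (2n - 5)!!. The formula and the function name `num_unrooted_binary_trees` are not taken from this abstract; they are included here as an illustrative assumption about which trees are being counted.

```python
import math


def num_unrooted_binary_trees(n_taxa: int) -> int:
    """Count unrooted binary tree topologies on n_taxa labeled leaves.

    Uses the standard result (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5) for n >= 3.
    This is an illustrative assumption, not a formula stated in the abstract.
    """
    if n_taxa < 1:
        raise ValueError("n_taxa must be a positive integer")
    if n_taxa < 3:
        return 1  # one or two taxa admit only a single topology
    # Product of the odd numbers from 1 up to and including 2n - 5.
    return math.prod(range(1, 2 * n_taxa - 4, 2))


if __name__ == "__main__":
    for n in (4, 5, 10, 20, 50):
        print(f"{n:>3} taxa: {num_unrooted_binary_trees(n):.3e} trees")
```

For 10 taxa the count is already 2,027,025, and it grows faster than any fixed exponential in n, which is why exhaustive evaluation quickly becomes infeasible.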
