δ Plots: A Tool for Analyzing Phylogenetic Distance Data

A method is described that allows the treelikeness of phylogenetic distance data to be assessed before tree estimation. The method is related to statistical geometry as introduced by Eigen, Winkler-Oswatitsch, and Dress (1988 [Proc. Natl. Acad. Sci. USA. 85:5913-5917]) and, in essence, displays a measure of the treelikeness of quartets as a histogram that we call a δ plot. This allows identification of nontreelike data and analysis of noisy data sets arising from processes such as parallel evolution, recombination, or lateral gene transfer. In addition to an overall assessment of treelikeness, individual taxa can be ranked by reference to the treelikeness of the quartets to which they belong. Removal of taxa on the basis of this ranking results in an increase in accuracy of tree estimation. Recombinant data sets are simulated, and the method is shown to be capable of identifying single recombinant taxa on the basis of distance information alone, provided the parents of the recombinant sequence are sufficiently divergent and the mixture of tree histories is not strongly skewed toward a single tree. δ plots and taxon rankings are applied to three biological data sets using distances derived from sequence alignment, gene order, and fragment length polymorphism.
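The abstract does not spell out the quartet measure, so the following is only a minimal sketch under a stated assumption: that δ for a quartet is computed from the four-point condition as (largest sum − middle sum) / (largest sum − smallest sum) over the three pairwise distance sums, giving 0 for a perfectly treelike quartet and values up to 1 otherwise. The function names (`quartet_delta`, `delta_values`, `taxon_mean_delta`, `delta_histogram`) and the choice to rank taxa by their mean δ are illustrative, not taken from the source.

```python
from itertools import combinations


def quartet_delta(d, x, y, u, v):
    """Delta score of one quartet from a symmetric distance matrix d.

    Assumed definition: sort the three pairwise sums and return
    (largest - middle) / (largest - smallest); 0.0 if all sums are equal.
    """
    small, mid, large = sorted([
        d[x][y] + d[u][v],
        d[x][u] + d[y][v],
        d[x][v] + d[y][u],
    ])
    if large == small:
        return 0.0
    return (large - mid) / (large - small)


def delta_values(d):
    """Map every quartet of taxon indices to its delta score."""
    return {q: quartet_delta(d, *q) for q in combinations(range(len(d)), 4)}


def taxon_mean_delta(d):
    """Mean delta over all quartets containing each taxon (assumed ranking score)."""
    n = len(d)
    totals = [0.0] * n
    counts = [0] * n
    for quartet, value in delta_values(d).items():
        for taxon in quartet:
            totals[taxon] += value
            counts[taxon] += 1
    return [totals[t] / counts[t] if counts[t] else 0.0 for t in range(n)]


def delta_histogram(values, bins=10):
    """Counts of delta scores in equal-width bins over [0, 1]."""
    counts = [0] * bins
    for value in values:
        counts[min(int(value * bins), bins - 1)] += 1
    return counts


# Distances from the tree ((0,1),(2,3)) with unit branch lengths: treelike.
tree_like = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]
# Same matrix with d(0,2) shortened, so the four-point condition fails.
conflicting = [
    [0, 2, 2, 3],
    [2, 0, 3, 3],
    [2, 3, 0, 2],
    [3, 3, 2, 0],
]
print(quartet_delta(tree_like, 0, 1, 2, 3))
print(quartet_delta(conflicting, 0, 1, 2, 3))
```

With real data, the histogram of all quartet scores would play the role of the δ plot, and taxa with the highest mean score would be candidates for closer inspection.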

[1] P. Buneman. The Recovery of Trees from Measures of Dissimilarity, 1971.

[2] D. Kendall, et al. Mathematics in the Archaeological and Historical Sciences, 1971, The Mathematical Gazette.

[3] J. Felsenstein. Confidence limits on phylogenies: an approach using the bootstrap, 1985, Evolution; international journal of organic evolution.

[4] N. Saitou, et al. The neighbor-joining method: a new method for reconstructing phylogenetic trees, 1987, Molecular biology and evolution.

[5] M. Eigen, et al. Statistical geometry in sequence space: a method of quantitative comparative sequence analysis, 1988, Proceedings of the National Academy of Sciences of the United States of America.

[6] M. Eigen, et al. Statistical geometry on sequence space, 1990, Methods in enzymology.

[7] A. Dress, et al. Split decomposition: a new and useful approach to phylogenetic analysis of distance data, 1992, Molecular phylogenetics and evolution.

[8] D. Penny, et al. Spectral analysis of phylogenetic data, 1993.

[9] F. Ayala, et al. The yeast Candida albicans has a clonal mode of reproduction in a population of infected human immunodeficiency virus-positive patients, 1993, Proceedings of the National Academy of Sciences of the United States of America.

[10] T. G. Mitchell, et al. Molecular markers reveal that population structure of the human pathogen Candida albicans exhibits both clonality and recombination, 1996, Proceedings of the National Academy of Sciences of the United States of America.

[11] James Lyons-Weiler, et al. Relative apparent synapomorphy analysis (RASA). I: The statistical measurement of phylogenetic signal, 1996, Molecular biology and evolution.

[12] K. Strimmer, et al. Quartet Puzzling: A Quartet Maximum-Likelihood Method for Reconstructing Tree Topologies, 1996.

[13] E. Holmes, et al. A likelihood method for the detection of selection and recombination using nucleotide sequences, 1997, Molecular biology and evolution.

[14] M. Tibayrenc. Are Candida albicans natural populations subdivided?, 1997, Trends in microbiology.

[15] Andrew Rambaut, et al. Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees, 1997, Comput. Appl. Biosci.

[16] G. McGuire, et al. A graphical method for detecting recombination in phylogenetic data sets, 1997, Molecular biology and evolution.

[17] K. Strimmer, et al. Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment, 1997, Proceedings of the National Academy of Sciences of the United States of America.

[18] G. Weiller. Phylogenetic profiles: a graphical method for detecting genetic recombinations in homologous sequences, 1998, Molecular biology and evolution.

[19] Daniel H. Huson, et al. SplitsTree: analyzing and visualizing evolutionary data, 1998, Bioinform.

[20] E. Holmes, et al. Phylogenetic evidence for recombination in dengue virus, 1999, Molecular biology and evolution.

[21] R. Cannon, et al. Evidence for a general-purpose genotype in Candida albicans, highly prevalent in multiple geographical regions, patient types and types of infection, 1999, Microbiology.

[22] David Sankoff, et al. Early eukaryote evolution based on mitochondrial gene order breakpoints, 2000, RECOMB '00.

[23] K. Crandall, et al. Intraspecific gene genealogies: trees grafting into networks, 2001, Trends in ecology & evolution.

[24] A. von Haeseler, et al. Quartet-mapping, a generalization of the likelihood-mapping procedure, 2001, Molecular biology and evolution.