Minimum Contradiction Matrices in Whole Genome Phylogenies

Minimum contradiction matrices are a useful complement to distance-based phylogenies. A minimum contradiction matrix represents phylogenetic information in the form of an ordered distance matrix $Y^n_{i,j}$. Each matrix element corresponds to the distance from a reference vertex $n$ to the path $(i, j)$. For an X-tree or a split network, the minimum contradiction matrix is a Robinson matrix and therefore fulfills all the inequalities defining perfect order: $Y^n_{i,j} \geq Y^n_{i,k}$ and $Y^n_{k,j} \geq Y^n_{k,i}$ for $i \leq j \leq k < n$. In real phylogenetic data, some taxa may contradict these inequalities. Contradictions to perfect order correspond to deviations from a tree or split network topology. Efficient algorithms that search for the best order are presented and tested on whole genome phylogenies with 184 taxa including many Bacteria, Archaea and Eukaryota. After optimization, taxa are classified in their correct domain and phyla. Several significant deviations from perfect order correspond to well-documented evolutionary events.
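
To make the perfect-order inequalities concrete, the following minimal sketch builds the matrix for a small additive tree metric and counts violations for a given taxon order. It assumes the common formula Y = ½(d(i,n) + d(j,n) − d(i,j)), which on a tree equals the distance from n to the path (i, j); this formula is not stated in the abstract. The function names and the toy distance matrix are hypothetical, and the sketch only checks a given order; it is not the order-search algorithm referred to above.

```python
def min_contradiction_matrix(d, order, ref):
    """Return Y with Y[a][b] = 0.5 * (d[i][ref] + d[j][ref] - d[i][j]),
    where i = order[a] and j = order[b]. The formula is an assumption."""
    return [[0.5 * (d[i][ref] + d[j][ref] - d[i][j]) for j in order]
            for i in order]


def count_contradictions(Y, tol=1e-9):
    """Count violations of Y[i][j] >= Y[i][k] and Y[k][j] >= Y[k][i]
    over all index triples i <= j <= k."""
    m = len(Y)
    violations = 0
    for i in range(m):
        for j in range(i, m):
            for k in range(j, m):
                if Y[i][j] < Y[i][k] - tol:
                    violations += 1
                if Y[k][j] < Y[k][i] - tol:
                    violations += 1
    return violations


# Toy tree metric (hypothetical data): cherries (0,1) and (2,3), all edges
# of length 1, with taxon 4 attached near the root and used as reference n.
d = [
    [0, 2, 4, 4, 3],
    [2, 0, 4, 4, 3],
    [4, 4, 0, 2, 3],
    [4, 4, 2, 0, 3],
    [3, 3, 3, 3, 0],
]

good = count_contradictions(min_contradiction_matrix(d, [0, 1, 2, 3], ref=4))
bad = count_contradictions(min_contradiction_matrix(d, [0, 2, 1, 3], ref=4))
print(good)  # 0: this order is compatible with the tree
print(bad)   # positive: interleaving the two cherries breaks perfect order
```

For a tree metric, an order compatible with the tree yields a Robinson matrix with no violations, while an order that interleaves the two cherries produces contradictions.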