Efficient inference of bacterial strain trees from genome-scale multilocus data

Motivation: In bacterial evolution, inferring a strain tree, which is the evolutionary history of different strains of the same bacterium, plays a major role in analyzing and understanding the evolution of strongly isolated populations, population divergence and various evolutionary events, such as horizontal gene transfer and homologous recombination. Inferring a strain tree from multilocus data of these strains is exceptionally hard because, at this scale of evolution, processes such as homologous recombination result in a very high degree of gene tree incongruence.

Results: In this article, we present a novel computational method for inferring the strain tree despite massive gene tree incongruence caused by homologous recombination. Our method operates in three phases: in phase I, a set of candidate strain-tree topologies is computed using the concept of maximal cliques; in phase II, divergence times for each of the topologies are estimated using mixed integer linear programming (MILP); and in phase III, the optimal tree (or trees) is selected based on an optimality criterion. We have analyzed 1898 genes from nine strains of the bacterium Staphylococcus aureus and identified a fully resolved (binary) strain tree with estimated divergence times, despite the high degree of sequence identity at the nucleotide level and the high degree of gene tree incongruence. Our method's efficiency makes it particularly suitable for the analysis of genome-scale datasets, including those of strongly isolated populations, which are usually very challenging to analyze.

Availability: We have implemented the algorithms in the PhyloNet software package, which is publicly available at http://bioinfo.cs.rice.edu/phylonet/

Contact: nakhleh@cs.rice.edu
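The abstract does not spell out how maximal cliques yield candidate topologies in phase I. One plausible reading, offered here only as an illustrative sketch and not as the PhyloNet implementation, is that clusters (sets of strains) observed in the gene trees are linked in a compatibility graph, and each maximal clique of pairwise compatible clusters is taken as one candidate topology. The function names, the compatibility test and the toy clusters below are all assumptions made for this example. Phases II and III (MILP divergence-time estimation and selection) are not shown.

```python
from itertools import combinations


def compatible(a, b):
    """Assumed test: two clusters can coexist in one rooted tree
    if they are disjoint or one contains the other."""
    return a.isdisjoint(b) or a <= b or b <= a


def maximal_cliques(adj):
    """Bron-Kerbosch (no pivoting); yields each maximal clique as a frozenset."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def candidate_topologies(clusters):
    """Hypothetical phase I sketch: each maximal set of pairwise
    compatible clusters is returned as one candidate topology."""
    clusters = list({frozenset(c) for c in clusters})  # drop duplicates
    adj = {c: set() for c in clusters}
    for a, b in combinations(clusters, 2):
        if compatible(a, b):
            adj[a].add(b)
            adj[b].add(a)
    return set(maximal_cliques(adj))


if __name__ == "__main__":
    # Toy clusters over strains A-E; {A,B} and {B,C} conflict.
    toy = [{"A", "B"}, {"B", "C"}, {"A", "B", "C"}, {"D", "E"}]
    for topo in candidate_topologies(toy):
        print(sorted(sorted(c) for c in topo))
```

On the toy input, the conflicting clusters {A,B} and {B,C} end up in different cliques, so two candidate topologies are produced; each would then be passed to the later phases for dating and selection.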
