BMC Bioinformatics BioMed Central

Background

Horizontal gene transfer (HGT) has allowed bacteria to evolve many new capabilities. Because transferred genes perform many medically important functions, such as conferring antibiotic resistance, improved detection of horizontally transferred genes from sequence data would be an important advance. Existing sequence-based methods for detecting HGT focus on changes in nucleotide composition or on differences between gene and genome phylogenies; these methods have high error rates.

Results

First, we introduce a new class of methods for detecting HGT based on the changes in nucleotide substitution rates that occur when a gene is transferred to a new organism. Our new methods discriminate simulated HGT events with an error rate up to 10 times lower than does GC content. Use of models that are not time-reversible is crucial for detecting HGT. Second, we show that using combinations of multiple predictors of HGT offers substantial improvements over using any single predictor, yielding as much as a factor of 18 improvement in performance (a maximum reduction in error rate from 38% to about 3%). Multiple predictors were combined by using the random forests machine learning algorithm to identify optimal classifiers that separate HGT from non-HGT trees.

Conclusion

The new class of HGT-detection methods introduced here combines advantages of phylogenetic and compositional HGT-detection techniques. These new techniques offer order-of-magnitude improvements over compositional methods because they are better able to discriminate HGT from non-HGT trees under a wide range of simulated conditions. We also found that combining multiple measures of HGT is essential for detecting a wide range of HGT events. These novel indicators of horizontal transfer will be widely useful in detecting HGT events linked to the evolution of important bacterial traits, such as antibiotic resistance and pathogenicity.
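For orientation, the compositional baseline that the new methods are compared against (GC content) can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the function names `gc_content` and `gc_deviation` are hypothetical, and the sequences are toy data invented for the example.

```python
def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "AT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return gc / total


def gc_deviation(gene: str, genome: str) -> float:
    """Absolute difference between a gene's GC content and its genome's."""
    return abs(gc_content(gene) - gc_content(genome))


# Toy sequences, invented for illustration only (assumption, not real data).
genome = "ATGCATGCATATATGCGCAT" * 10
native_gene = "ATGCATATGCAT"
foreign_gene = "GCGCGGCCGCGC"

# A gene whose composition departs strongly from the host genome is a
# candidate for horizontal transfer under a purely compositional test.
print(gc_deviation(native_gene, genome))
print(gc_deviation(foreign_gene, genome))
```

A single compositional score of this kind is what the abstract reports as error-prone on its own; the rate-based predictors and the random forests combination described above are intended to improve on it.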
