The branching order and phylogenetic placement of species from completed bacterial genomes, based on conserved indels found in various proteins

Abstract. The presence of shared conserved inserts and deletions (indels or signature sequences) in proteins provides a powerful means for understanding the evolutionary relationships among the Bacteria. Using such indels, all of the main groups within the Bacteria can be defined in clear molecular terms, and it has become possible to deduce that they branched from a common ancestor in the following order: Low G+C Gram-positive → High G+C Gram-positive → Deinococcus–Thermus → Cyanobacteria → Spirochetes → Aquifex–Chlamydia–Cytophaga → Proteobacteria-1 (ε, δ) → Proteobacteria-2 (α) → Proteobacteria-3 (β) → Proteobacteria-4 (γ). The usefulness of this approach for understanding bacterial phylogeny was examined here using sequence data from various completed bacterial genomes. Using 12 indels in highly conserved and widely represented proteins, the species from all 41 completed bacterial genomes were assigned to different groups, and the observed distribution of these indels in different species was then compared with that predicted by the signature sequence model. The presence or absence of these indels in various proteins in different bacteria followed the predicted pattern exactly: in more than 450 observations, no exceptions or contradictions in the placement of indels were observed. These results provide strong evidence that lateral gene transfer events have not affected the genes containing these indels to any significant extent. The phylogenetic placement of bacteria into different groups based on signature sequences also showed an excellent correlation with the 16S rRNA phylogeny, with 39 of the 41 species assigned to the same group by both methods. These results strongly vindicate the usefulness of the signature sequence approach to understanding phylogeny within the Bacteria and show that it provides a reliable and internally consistent means for the placement of bacterial species into different groups and for determining the relative branching order of the groups.
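
The comparison of observed and predicted indel distributions described in the abstract can be illustrated with a minimal sketch. It rests on a simplifying assumption that is not spelled out in the abstract: an indel introduced at a given point in the linear branching order is taken to be present in that group and in every later-branching group, and absent from every earlier-branching group. The function names, the species labels, and the example observations are all hypothetical and serve only to show the bookkeeping; they are not the 12 indels or the data used in the study.

```python
from typing import Dict, List, Tuple

# Branching order as listed in the abstract, earliest-branching group first.
BRANCHING_ORDER: List[str] = [
    "Low G+C Gram-positive",
    "High G+C Gram-positive",
    "Deinococcus-Thermus",
    "Cyanobacteria",
    "Spirochetes",
    "Aquifex-Chlamydia-Cytophaga",
    "Proteobacteria-1 (epsilon, delta)",
    "Proteobacteria-2 (alpha)",
    "Proteobacteria-3 (beta)",
    "Proteobacteria-4 (gamma)",
]


def predicted_presence(first_group_with_indel: str) -> Dict[str, bool]:
    """Predict, for every group, whether the indel should be present.

    Assumption (hypothetical simplification): the indel arose once and is
    shared by the named group and all groups that branch after it.
    """
    start = BRANCHING_ORDER.index(first_group_with_indel)
    return {group: i >= start for i, group in enumerate(BRANCHING_ORDER)}


def find_contradictions(
    first_group_with_indel: str,
    observations: List[Tuple[str, str, bool]],
) -> List[Tuple[str, str]]:
    """Return (species, group) pairs whose observed state disagrees
    with the prediction. Each observation is (species, group, has_indel)."""
    expected = predicted_presence(first_group_with_indel)
    return [
        (species, group)
        for species, group, has_indel in observations
        if expected[group] != has_indel
    ]


# Hypothetical observations for one illustrative indel, not real data.
example_observations = [
    ("species_A", "Low G+C Gram-positive", False),
    ("species_B", "Deinococcus-Thermus", False),
    ("species_C", "Cyanobacteria", True),
    ("species_D", "Proteobacteria-4 (gamma)", True),
]

contradictions = find_contradictions("Cyanobacteria", example_observations)
print(contradictions)
```

An empty list of contradictions corresponds to the situation reported in the abstract, where every observation matched the predicted pattern. Indels that are specific to a single group would need a different prediction rule, which this sketch does not model.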
