Bacterial population assay via k-mer analysis

Identifying and assaying the relative abundance of members of complex microbial communities is an important problem in ecology. Sandberg et al. [1] investigated the use of genomic signatures to provide high identification percentages from short sequence samples. In this paper we present an improved naive Bayesian classification method using conditional probabilities, which can be used to classify unsequenced bacterial species, as well as to identify and predict the frequency of the dominant species in mixed microbial populations.
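To make the general idea concrete, the following is a minimal sketch of a plain multinomial naive Bayes classifier over k-mer counts. It is not the improved conditional-probability method the paper presents, whose details are not given in this abstract. The function names (`kmer_counts`, `train`, `classify`), the add-one pseudocount, the uniform prior over species, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGT"

def kmer_counts(seq, k):
    """Count overlapping k-mers, skipping any that contain non-ACGT symbols."""
    seq = seq.upper()
    kmers = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return Counter(km for km in kmers if set(km) <= set(ALPHABET))

def train(genomes, k, pseudocount=1.0):
    """Estimate log P(k-mer | species) for each reference genome.

    The pseudocount (an assumed smoothing choice) keeps unseen k-mers
    from receiving zero probability.
    """
    all_kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    model = {}
    for name, seq in genomes.items():
        counts = kmer_counts(seq, k)
        total = sum(counts.values()) + pseudocount * len(all_kmers)
        model[name] = {
            km: math.log((counts[km] + pseudocount) / total)
            for km in all_kmers
        }
    return model

def classify(read, model, k):
    """Return the species with the highest log-likelihood for the read.

    A uniform prior over species is assumed, and overlapping k-mers are
    treated as independent (the "naive" assumption).
    """
    counts = kmer_counts(read, k)
    scores = {
        name: sum(n * logp[km] for km, n in counts.items())
        for name, logp in model.items()
    }
    return max(scores, key=scores.get)

# Toy reference "genomes" (hypothetical, for illustration only).
genomes = {
    "at_rich": "ATATATTTAAATATAT" * 5,
    "gc_rich": "GCGCGGCCGCGCCGGC" * 5,
}
model = train(genomes, k=2)
label_at = classify("ATATTA", model, k=2)
label_gc = classify("GCCGGC", model, k=2)
```

Under these assumptions, classifying every read in a mixed sample and tallying the labels would give a rough estimate of which species dominate, although per-read misclassification would bias such counts.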

[1] R. Sandberg, et al. Capturing whole-genome characteristics in short sequences using a naïve Bayesian classifier. Genome Research, 2001.

[2] Hinrich Schütze, et al. Book Reviews: Foundations of Statistical Natural Language Processing. CL, 1999.

[3] S. Karlin, et al. Comparative DNA analysis across diverse genomes. Annual Review of Genetics, 1998.

[4] J. Dunn, et al. Genomic signature tags (GSTs): a system for profiling genomic DNA. Genome Research, 2002.

[5] A. Protopopov, et al. Restriction site tagged (RST) microarrays: a novel technique to study the species composition of complex microbial systems. Nucleic Acids Research, 2003.

[6] Ross A. Overbeek, et al. The Ribosomal Database Project (RDP). Nucleic Acids Res., 1996.

[7] Christoforos Nikolaou, et al. Mutually symmetric and complementary triplets: differences in their use distinguish systematically between coding and non-coding genomic sequences. Journal of Theoretical Biology, 2003.

[8] S. Karlin, et al. Similarities and dissimilarities of phage genomes. Proceedings of the National Academy of Sciences of the United States of America, 1996.

[9] S. Karlin, et al. Global dinucleotide signatures and analysis of genomic heterogeneity. Current Opinion in Microbiology, 1998.

[10] P. Deschavanne, et al. Genomic signature: characterization and classification of species assessed by chaos game representation of sequences. Molecular Biology and Evolution, 1999.

[11] Gösta Winberg, et al. NotI passporting to identify species composition of complex microbial systems. Nucleic Acids Research, 2003.

[12] Alain Giron, et al. A genomic schism in birds revealed by phylogenetic analysis of DNA strings. Systematic Biology, 2002.

[13] R. Jernigan, et al. Pervasive properties of the genomic signature. BMC Genomics, 2002.