Bayesian community-wide culture-independent microbial source tracking

Contamination is a critical issue in high-throughput metagenomic studies, yet progress toward a comprehensive solution has been limited. We present SourceTracker, a Bayesian approach to estimate the proportion of contaminants in a given community that come from possible source environments. We applied SourceTracker to microbial surveys from neonatal intensive care units (NICUs), offices and molecular biology laboratories, and provide a database of known contaminants for future testing.
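To make the idea of estimating source proportions concrete, the following is a minimal sketch and not the SourceTracker implementation. It assumes each source environment is summarized by a fixed, known vector of taxon relative abundances, and it uses a simple Gibbs sampler to assign every sequence in a test ("sink") sample to one source. The function name `estimate_source_proportions`, the parameters `alpha`, `n_iter` and `burn_in`, and the toy "soil" and "skin" profiles are all hypothetical, chosen only for illustration; the published method may include features omitted here.

```python
import random


def estimate_source_proportions(sink_counts, sources, alpha=0.1,
                                n_iter=200, burn_in=50, seed=0):
    """Estimate the fraction of a sink sample drawn from each source.

    sink_counts: list of per-taxon sequence counts in the sink sample.
    sources: dict mapping source name -> list of per-taxon relative
             abundances (assumed known and fixed in this sketch).
    alpha: symmetric Dirichlet prior weight on the mixing proportions.
    """
    rng = random.Random(seed)
    names = list(sources)
    k = len(names)

    # Expand counts so that each observed sequence is one entry (its taxon).
    taxa = [t for t, c in enumerate(sink_counts) for _ in range(c)]

    # Random initial source assignment for every sequence.
    assign = [rng.randrange(k) for _ in taxa]
    tally = [0] * k
    for z in assign:
        tally[z] += 1

    totals = [0.0] * k
    kept = 0
    for it in range(n_iter):
        for i, t in enumerate(taxa):
            # Remove the current sequence from the tallies, then resample it.
            tally[assign[i]] -= 1
            # Weight = P(taxon | source) * (sequences already assigned + prior).
            # The tiny constant keeps the total weight strictly positive.
            weights = [(sources[names[j]][t] + 1e-9) * (tally[j] + alpha)
                       for j in range(k)]
            z = rng.choices(range(k), weights=weights)[0]
            assign[i] = z
            tally[z] += 1
        if it >= burn_in:
            # Accumulate the smoothed proportions after burn-in.
            for j in range(k):
                totals[j] += (tally[j] + alpha) / (len(taxa) + k * alpha)
            kept += 1

    return {names[j]: totals[j] / kept for j in range(k)}


# Toy example: a sink sample built as roughly 80% "soil" and 20% "skin".
toy_sources = {
    "soil": [0.7, 0.2, 0.1, 0.0],
    "skin": [0.0, 0.1, 0.2, 0.7],
}
toy_sink = [112, 36, 24, 28]
est = estimate_source_proportions(toy_sink, toy_sources)
print(est)
```

In this toy setting the first taxon occurs only in "soil" and the last only in "skin", so the sampler has to decide only how to split the two shared taxa. Real surveys would also need to account for sequences that match none of the listed sources, which this sketch does not attempt.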
