Microbial Contamination in Next Generation Sequencing: Implications for Sequence-Based Analysis of Clinical Samples

The high accuracy and sensitivity of next-generation sequencing for quantifying genetic material across organismal boundaries give it tremendous potential for pathogen discovery and diagnosis in human disease. Despite this promise, existing human-derived RNA-seq datasets routinely contain substantial bacterial contamination that likely arises from environmental sources. This finding calls for stringent sequencing and analysis protocols in studies that investigate sequence-based microbial signatures in clinical samples.
