Metagenomic unmapped reads provide important insights into human microbiota and disease associations

We developed a computational pipeline, MicroPro, for metagenomic data analysis that takes into account all reads, from both known and unknown microbial organisms, and that associates viruses with complex diseases. We used MicroPro to analyze metagenomic data for three diseases (colorectal cancer, type 2 diabetes, and liver cirrhosis) and showed that including reads from unknown organisms markedly increases the accuracy of predicting disease status from metagenomic data. We identified new microbial organisms associated with these diseases. Viruses were shown to play important roles in colorectal cancer and liver cirrhosis, but not in type 2 diabetes. MicroPro is available at https://github.com/zifanzhu/MicroPro.
