Accurate Five-category Classification for Colorectal Cancer Using Gut Microbiome 16S rRNA Sequencing

The association between the gut microbiome and the five stages of colorectal cancer (CRC) (healthy, polyposis, nonadvanced adenoma, advanced adenoma, and cancer) remains unclear. We performed 16S rRNA sequencing of the V3-V4 amplicon from 999 samples from subjects at various stages of CRC development and constructed an accurate predictive random forest model for CRC development. In the testing set, our five-category CRC prediction classifier had accuracies of 0.84 and 0.74 using the relative operational taxonomic unit (OTU) abundances and relative genus abundances, respectively. Specifically, the OTU-based classifier had a sensitivity of 0.97 and specificity of 0.97 for CRC samples, and the genus-based classifier had a sensitivity of 0.97 and specificity of 0.95 for CRC samples. Meanwhile, the gut microbiota was found to differ at all stages of CRC development. The differential abundances of closely related bacteria were used to accurately classify the five stages of CRC development. Additionally, both unannotated and annotated OTUs played important roles in classifier modelling. Our work not only provides valuable 16S rRNA sequencing data from patients and healthy individuals on a large scale but also identifies reproducible gut microbiome biomarkers for CRC staging and highlights their potential applications as noninvasive microbiome biomarkers for diagnosis and as predictive CRC screening tests.
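The per-class sensitivity and specificity quoted above for CRC samples can be read as one-vs-rest metrics on a five-category prediction. The abstract does not say how they were computed, so the following is only a minimal sketch under that assumption. The stage labels, function names, and toy predictions are hypothetical and are not data from the study.

```python
# Hypothetical labels for the five categories named in the abstract.
STAGES = ["healthy", "polyposis", "nonadvanced adenoma", "advanced adenoma", "cancer"]


def relative_abundance(counts):
    """Convert raw OTU (or genus) read counts for one sample into proportions."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted category matches the true one."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Sensitivity and specificity for one category against all others."""
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels only, chosen to exercise every branch; not results from the paper.
y_true = ["cancer", "cancer", "healthy", "polyposis", "cancer", "healthy"]
y_pred = ["cancer", "cancer", "healthy", "cancer", "healthy", "healthy"]

sens, spec = one_vs_rest_metrics(y_true, y_pred, positive="cancer")
print(f"accuracy={accuracy(y_true, y_pred):.2f} "
      f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In practice the predictions would come from a classifier trained on the relative abundance vectors (the abstract names a random forest), with the metrics evaluated on a held-out testing set.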
