Big data in IBD: big progress for clinical practice

Inflammatory bowel disease (IBD) is a complex, multifactorial inflammatory disease of the gut driven by extrinsic and intrinsic factors, including host genetics, the immune system, environmental factors and the gut microbiome. Technological advances such as next-generation sequencing, high-throughput omics data generation and molecular networks have catalysed IBD research. The advent of artificial intelligence (in particular, machine learning) and systems biology has opened avenues for efficiently integrating and interpreting big datasets to discover clinically translatable knowledge. In this narrative review, we discuss how big data integration and machine learning have been applied to translational IBD research. Machine learning approaches may enable patient stratification and the prediction of disease progression and therapy response, allowing treatment options to be fine-tuned with positive impacts on cost, health and safety. We also outline the challenges and opportunities presented by machine learning and big data in clinical IBD research.
