Dynamic Bayesian Networks for Integrating Multi-omics Time Series Microbiome Data

A key challenge in the analysis of longitudinal microbiome data is the inference of temporal interactions between microbial taxa, their genes, the metabolites they consume and produce, and host genes. To address this challenge, we developed a computational pipeline, PALM, that first aligns multi-omics data and then uses dynamic Bayesian networks (DBNs) to reconstruct a unified model. Our approach overcomes differences in sampling and progression rates, utilizes a biologically inspired multi-omic framework, reduces the large number of entities and parameters in the DBNs, and validates the learned network. Applying PALM to data collected from inflammatory bowel disease patients, we show that it accurately identifies known and novel interactions. Targeted experimental validations further support a number of the predicted novel metabolite-taxa interactions. Source code and data will be freely available after publication under the MIT Open Source license agreement on our GitHub page.

IMPORTANCE While a number of large consortia are collecting and profiling several different types of microbiome and genomic time series data, very few methods exist for joint modeling of multi-omics data sets. We developed a new computational pipeline, PALM, which uses dynamic Bayesian networks (DBNs) and is designed to integrate multi-omics data from longitudinal microbiome studies. When used to integrate sequence, expression, and metabolomics data from microbiome samples along with host expression data, the resulting models identify interactions between taxa, their genes, and the metabolites they produce and consume, as well as their impact on host expression. We tested the models both by using them to predict future changes in microbiome levels and by comparing the learned interactions to known interactions in the literature. Finally, we performed experimental validations for a few of the predicted interactions to demonstrate the ability of the method to identify novel relationships and their impact.
