Compositional zero-inflated network estimation for microbiome data

Background: The estimation of microbial networks can provide important insight into the ecological relationships among the organisms that comprise the microbiome. However, there are a number of critical statistical challenges in the inference of such networks from high-throughput data. Since the abundances in each sample are constrained to have a fixed sum and there is incomplete overlap in microbial populations across subjects, the data are both compositional and zero-inflated.

Results: We propose the COmpositional Zero-Inflated Network Estimation (COZINE) method for inference of microbial networks, which addresses these critical aspects of the data while maintaining computational scalability. COZINE relies on the multivariate Hurdle model to infer a sparse set of conditional dependencies that reflect not only relationships among the continuous values, but also relationships among binary indicators of presence or absence and between the binary and continuous representations of the data. Our simulation results show that the proposed method is better able to capture various types of microbial relationships than existing approaches. We demonstrate the utility of the method with an application to understanding the oral microbiome network in a cohort of leukemic patients.

Conclusions: Our proposed method addresses important challenges in microbiome network estimation, and can be effectively applied to discover various types of dependence relationships in microbial communities. COZINE is available online at https://github.com/MinJinHa/COZINE.
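To make the binary and continuous representations mentioned above concrete, the following is a minimal sketch of how a count table might be split into the two parts. It is not taken from the COZINE code: the function name `hurdle_representation` is hypothetical, and the choice of a centered log-ratio computed over the nonzero entries of each sample is an assumption made here for illustration.

```python
import numpy as np

def hurdle_representation(counts):
    """Split a samples-by-taxa count matrix into two parts.

    Returns (present, continuous):
      present    -- 0/1 indicators of whether each taxon was observed
      continuous -- log counts centered on the per-sample mean of the
                    observed taxa; unobserved entries stay exactly 0

    Illustrative assumption only, not the COZINE implementation.
    """
    counts = np.asarray(counts, dtype=float)
    present = counts > 0
    # Take logs of nonzero counts only; zeros are replaced by 1 before the
    # log so that no -inf values are produced, then masked back to 0.
    log_counts = np.where(present, np.log(np.where(present, counts, 1.0)), 0.0)
    # Per-sample mean of log counts over the observed taxa.
    n_present = present.sum(axis=1, keepdims=True)
    row_mean = log_counts.sum(axis=1, keepdims=True) / np.maximum(n_present, 1)
    continuous = np.where(present, log_counts - row_mean, 0.0)
    return present.astype(int), continuous

# Toy data: two samples, three taxa.
counts = np.array([[10, 0, 1000],
                   [5, 5, 0]])
present, continuous = hurdle_representation(counts)
```

Because each sample is centered on its own mean log count, multiplying a sample by any positive constant (for example, a different sequencing depth) leaves the continuous part unchanged, which is the property that makes a log-ratio representation suitable for compositional data. The zeros are carried separately by the indicator matrix rather than being replaced with a pseudocount.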
