Zero-inflated negative binomial mixed model: an application to two microbial organisms important in oesophagitis

SUMMARY Altered microbial communities are thought to play an important role in eosinophilic oesophagitis, an allergic inflammatory condition of the oesophagus. High-throughput sequencing technology has made it feasible to identify the majority of organisms present in human-associated microbial communities. However, the resulting data consist of non-negative, highly skewed sequence counts with a large proportion of zeros. In addition, studies often use hierarchical designs, with repeated measurements or multiple samples collected from the same subject, and therefore require approaches that account for within-subject variation. Yet only a small number of microbiota studies have applied hierarchical regression models. In this paper, we describe and illustrate a hierarchical regression-based approach for evaluating multiple factors for a small number of organisms, one organism at a time. Specifically, we apply the zero-inflated negative binomial mixed model, with random effects in both the count and zero-inflated parts, to human microbiota sequence data from a study of oesophagitis. For two organisms of interest, the model is used to evaluate associations with disease state while adjusting for potential confounders.
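
As a rough illustration of the model class named in the summary, the sketch below evaluates a zero-inflated negative binomial probability for one observation, conditional on subject-level random intercepts in both the count and zero-inflated parts. It is a minimal sketch under stated assumptions, not the authors' implementation: the log link for the count mean, the logit link for the zero-inflation probability, the mean/dispersion parameterization (variance mu + mu^2/theta), and all names (`zinb_pmf`, `beta`, `gamma`, `b_count`, `b_zero`) are illustrative choices not taken from the paper.

```python
import math


def nb_log_pmf(y, mu, theta):
    """Log-pmf of a negative binomial with mean mu and dispersion theta.

    Assumed parameterization: variance = mu + mu**2 / theta.
    """
    return (math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
            + theta * math.log(theta / (theta + mu))
            + y * math.log(mu / (theta + mu)))


def sigmoid(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def zinb_pmf(y, x, beta, gamma, theta, b_count=0.0, b_zero=0.0):
    """P(Y = y) for a zero-inflated negative binomial, conditional on
    hypothetical subject-level random intercepts b_count and b_zero."""
    # Count part: log link, fixed effects beta plus a random intercept.
    mu = math.exp(sum(xi * bi for xi, bi in zip(x, beta)) + b_count)
    # Zero-inflated part: logit link, fixed effects gamma plus a random intercept.
    pi = sigmoid(sum(xi * gi for xi, gi in zip(x, gamma)) + b_zero)
    nb = math.exp(nb_log_pmf(y, mu, theta))
    # Mixture: a structural zero with probability pi, otherwise a NB draw.
    return pi * (y == 0) + (1.0 - pi) * nb


# Example: intercept plus one binary covariate (e.g. a disease-state indicator).
p_zero = zinb_pmf(0, [1.0, 1.0], [1.0, 0.5], [-1.0, 0.3], 2.0, 0.2, -0.1)
```

Fitting such a model additionally requires integrating this conditional probability over the distribution of the random effects to obtain the marginal likelihood; that step is not shown here.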
