Log-Contrast Regression with Functional Compositional Predictors: Linking Preterm Infant's Gut Microbiome Trajectories in Early Postnatal Period to Neurobehavioral Outcome

When compositional data serve as predictors in regression, the log-contrast model is commonly applied. A prominent feature of the model is that it respects the simplex geometry and gives the regression analysis several desirable invariance properties. Motivated by the need to understand how the trajectories of gut microbiome composition during the early postnatal stage affect later neurobehavioral outcomes among preterm infants, we develop a sparse log-contrast regression with functional compositional predictors. The functional simplex structure is preserved by a set of zero-sum constraints on the parameters, and the compositional predictors are allowed to have sparse, smoothly varying, and accumulating effects on the outcome through time. Through basis expansion, the problem reduces to a linearly constrained group lasso regression, for which we develop an efficient augmented Lagrangian algorithm and obtain theoretical performance guarantees. The proposed approach yields interesting results in the preterm infant study. The identified microbiome markers and the estimated time dynamics of their impact on the neurobehavioral outcome shed light on the functional linkage between stress accumulation in the early postnatal stage and the neurodevelopmental process of infants.
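The role of the zero-sum constraints can be illustrated with a minimal numerical sketch. It is not the paper's estimator: there is no group lasso penalty and no augmented Lagrangian solver. It only shows that, after a basis expansion, coefficients that sum to zero across components make the prediction depend on the composition alone. The polynomial basis, the Riemann-sum approximation of the time integral, and all names (`design`, `predict`, `theta`) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, T, k = 6, 4, 20, 3  # subjects, components (taxa), time points, basis functions

t = np.linspace(0.0, 1.0, T)
# Hypothetical basis: polynomials 1, t, t^2 evaluated on the time grid.
B = np.vander(t, k, increasing=True)  # shape (T, k)

# Raw positive abundances over time, and their closure to the simplex at each time.
raw = rng.uniform(1.0, 100.0, size=(n, p, T))
comp = raw / raw.sum(axis=1, keepdims=True)

def design(x):
    """Approximate the integral of log x_ij(t) * B(t) dt with a Riemann sum."""
    dt = t[1] - t[0]
    return np.einsum("ipt,tk->ipk", np.log(x), B) * dt  # shape (n, p, k)

# Basis coefficients for each component, centered so that they sum to zero
# across components for every basis function (the zero-sum constraint).
theta = rng.normal(size=(p, k))
theta -= theta.mean(axis=0, keepdims=True)

def predict(x, theta):
    """Linear predictor: sum over components and basis functions."""
    return np.einsum("ipk,pk->i", design(x), theta)

pred_raw = predict(raw, theta)
pred_comp = predict(comp, theta)
```

Under the constraint, `pred_raw` and `pred_comp` agree up to floating-point error, because the time-varying total abundance multiplies every component equally and its logarithm is cancelled by coefficients that sum to zero. Each row `theta[j]` is the coefficient group for one component, which is the unit a group lasso penalty would shrink to zero as a whole.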
