Novel metrics for growth model selection

Background
The literature on statistical modeling of childhood growth data offers a diverse set of models from which investigators can choose. However, the lack of a comprehensive framework for comparing non-nested models makes it difficult to assess model performance. This paper proposes a framework for comparing non-nested growth models using novel metrics of predictive accuracy based on modifications of the mean squared error criterion.

Methods
Three metrics were created: normalized, age-adjusted, and weighted mean squared error (MSE). These predictive performance metrics were used to compare linear mixed effects models and functional regression models. Prediction accuracy was assessed by partitioning the observed data into training and test datasets. The partitioning was constructed to assess prediction accuracy in four scenarios: backward (i.e., early growth), forward (i.e., late growth), in-range, and new-individual prediction. Analyses were done with height measurements from 215 Peruvian children, with data spanning from near birth to 2 years of age.

Results
Functional models outperformed linear mixed effects models in all scenarios tested. In particular, prediction errors for functional concurrent regression (FCR) and functional principal component analysis models were approximately 6% lower than those for linear mixed effects models. When subject-specific MSEs were weighted according to subject-specific growth rates during infancy, FCR was the best performer in all scenarios.

Conclusion
With this novel approach, we can quantitatively compare non-nested models and weight subgroups of interest to select the best performing growth model for a particular application or problem at hand.
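The abstract does not give formulas for the three metrics, so the following is only a minimal Python sketch of the general idea behind a weighted MSE: compute one MSE per child on held-out data, then combine them with subject-specific weights. The function names (`subject_mse`, `weighted_mse`, `forward_split`), the record layout, and the weights in the usage example are all hypothetical, and the weighted average shown here is an assumed form, not the authors' definition.

```python
from collections import defaultdict


def forward_split(measurements, cutoff_age):
    """Split (subject_id, age, height) tuples for a 'forward' scenario.

    Assumption: measurements at or before cutoff_age train the model and
    later measurements are held out. The paper's exact rule is not stated.
    """
    train = [m for m in measurements if m[1] <= cutoff_age]
    test = [m for m in measurements if m[1] > cutoff_age]
    return train, test


def subject_mse(records):
    """Return {subject_id: MSE} from (subject_id, observed, predicted) tuples."""
    squared_errors = defaultdict(list)
    for subject_id, observed, predicted in records:
        squared_errors[subject_id].append((observed - predicted) ** 2)
    return {sid: sum(errs) / len(errs) for sid, errs in squared_errors.items()}


def weighted_mse(mse_by_subject, weights):
    """Weighted average of subject-specific MSEs (assumed form).

    weights maps subject_id to a non-negative weight, for example one
    derived from each child's growth rate during infancy.
    """
    total_weight = sum(weights[sid] for sid in mse_by_subject)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[sid] * mse for sid, mse in mse_by_subject.items())
    return weighted_sum / total_weight


# Hypothetical held-out predictions: (subject_id, observed_cm, predicted_cm)
records = [("a", 50.0, 51.0), ("a", 60.0, 58.0), ("b", 52.0, 52.5)]
per_subject = subject_mse(records)
overall = weighted_mse(per_subject, {"a": 1.0, "b": 3.0})
```

Because each metric is computed only from held-out observations and predictions, the same calculation applies to any fitted model, which is what allows non-nested models such as linear mixed effects and functional regression models to be compared on a common scale.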
