gamboostLSS: An R Package for Model Building and Variable Selection in the GAMLSS Framework

Generalized additive models for location, scale and shape (GAMLSS) are a flexible class of regression models that allow multiple parameters of a distribution function, such as the mean and the standard deviation, to be modelled simultaneously. With the R package gamboostLSS, we provide a boosting method to fit these models. Variable selection and model choice are naturally available within this regularized regression framework. To introduce and illustrate the R package gamboostLSS and its infrastructure, we use a data set on stunted growth in India. In addition to the specification and application of the model itself, we present a variety of convenience functions, including methods for tuning parameter selection, prediction and visualization of results. The package gamboostLSS is available from CRAN.
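As a minimal sketch of the workflow summarized above, the following R code fits a boosted GAMLSS to the stunting data shipped with the package and then tunes the number of boosting iterations by cross-validation. The choice of covariates, the Gaussian family, the initial `mstop = 100`, the step length `nu = 0.1` and the 5-fold setup are illustrative assumptions, not the settings used in the paper.

```r
# Sketch only: covariates, family and tuning values are illustrative assumptions.
library("gamboostLSS")

# Data set on stunted growth in India, shipped with gamboostLSS
data("india", package = "gamboostLSS")

# Linear base-learners for both distribution parameters (mu and sigma)
# of a Gaussian response; the same formula is used for each parameter.
mod <- glmboostLSS(stunting ~ mage + mbmi + cage + cbmi,
                   data = india,
                   families = GaussianLSS(),
                   control = boost_control(mstop = 100, nu = 0.1))

# Coefficients of the base-learners selected so far, per parameter
coef(mod)

# Tuning: cross-validated risk over a grid of stopping iterations
set.seed(1)
folds <- cv(model.weights(mod), type = "kfold", B = 5)
cvr <- cvrisk(mod, folds = folds)

# Set the model to the stopping iterations with the lowest estimated risk
mstop(mod) <- mstop(cvr)

# Prediction: fitted values of mu for the first observations
head(predict(mod, parameter = "mu", type = "response"))
```

Because boosting selects one base-learner per iteration, covariates that are never selected before the stopping iteration drop out of the model, which is how early stopping yields variable selection here.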
