Generalized Multilevel Functional-on-Scalar Regression and Principal Component Analysis

This manuscript considers regression models for generalized, multilevel functional responses: functions are generalized in that they follow an exponential family distribution, and multilevel in that they are clustered within groups or subjects. This data structure is increasingly common across scientific domains and is exemplified by our motivating example, in which binary curves indicating physical activity or inactivity are observed for nearly six hundred subjects over five days. We use a generalized linear model to incorporate scalar covariates into the mean structure, and decompose subject-specific and subject-day-specific deviations using multilevel functional principal components analysis. Thus, functional fixed effects are estimated while accounting for within-function and within-subject correlations, and major directions of variability within and between subjects are identified. Fixed effect coefficient functions and principal component basis functions are estimated using penalized splines; model parameters are estimated in a Bayesian framework using Stan, a programming language that implements a Hamiltonian Monte Carlo sampler. Simulations designed to mimic the application indicate good estimation accuracy and inference with reasonable computation times for moderately sized datasets, in both cross-sectional and multilevel scenarios; code is publicly available. In the application, we identify effects of age and BMI on the time-specific change in probability of being active over a twenty-four-hour period; in addition, the principal components analysis identifies the patterns of activity that distinguish subjects and days within subjects.
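To make the data structure concrete, the following is a minimal simulation sketch of binary, multilevel functional responses of the kind described above. It is not the authors' code or model specification: the logit link, the single scalar covariate, the use of one principal component per level, and every function shape and variance are assumptions chosen for illustration, and the name `simulate_binary_curves` is hypothetical.

```python
import math
import random


def simulate_binary_curves(n_subjects=5, n_days=3, n_grid=24, seed=0):
    """Simulate binary activity curves Y_ij(t) for subject i on day j.

    Assumed (illustrative) model on the logit scale:
        eta_ij(t) = beta0(t) + x_i * beta1(t) + c_i * psi1(t) + c_ij * psi2(t)
    where psi1 is a subject-level and psi2 a subject-day-level basis function.
    """
    rng = random.Random(seed)
    grid = [t / n_grid for t in range(n_grid)]  # time of day scaled to [0, 1)

    # Hypothetical fixed effect coefficient functions.
    beta0 = [-1.0 + 2.0 * math.sin(math.pi * t) for t in grid]  # higher mid-day
    beta1 = [-0.5 * math.cos(2 * math.pi * t) for t in grid]    # covariate effect

    # Hypothetical principal component basis functions, one per level.
    psi1 = [math.sqrt(2) * math.sin(2 * math.pi * t) for t in grid]
    psi2 = [math.sqrt(2) * math.cos(2 * math.pi * t) for t in grid]

    curves = []
    for i in range(n_subjects):
        x_i = rng.gauss(0.0, 1.0)  # standardized scalar covariate (e.g. age)
        c_i = rng.gauss(0.0, 1.0)  # subject-level score, shared across days
        for j in range(n_days):
            c_ij = rng.gauss(0.0, 0.5)  # subject-day-level score
            eta = [
                beta0[k] + x_i * beta1[k] + c_i * psi1[k] + c_ij * psi2[k]
                for k in range(n_grid)
            ]
            prob = [1.0 / (1.0 + math.exp(-e)) for e in eta]  # inverse logit
            y = [1 if rng.random() < p else 0 for p in prob]  # Bernoulli draws
            curves.append({"subject": i, "day": j, "x": x_i, "prob": prob, "y": y})
    return curves


curves = simulate_binary_curves()
```

Each element of `curves` holds one subject-day: a 0/1 vector `y` on the time grid together with the underlying probabilities `prob`. Curves from the same subject share the score `c_i`, which is what induces the within-subject correlation that the multilevel decomposition is meant to capture.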
