Longitudinal Principal Component Analysis With an Application to Marketing Data

Abstract. We propose a longitudinal principal component analysis method for multivariate longitudinal data based on a random-effects eigen-decomposition. The eigen-decomposition incorporates longitudinal information through nonparametric splines, and the multivariate random effects capture the substantial store-wise heterogeneity. The method can effectively analyze large marketing datasets containing sales information for products from hundreds of stores over an 11-year period, and it leads to more accurate estimation and interpretation than existing approaches. We illustrate the method through simulation studies and an application to marketing data from IRI. Supplementary materials for this article are available online.
