Modeling strategies in longitudinal data analysis: Covariate, variance function and correlation structure selection

A modeling paradigm is proposed for selecting the covariates, the variance function and the working correlation structure in longitudinal data analysis. Correct variance modeling depends on appropriate covariate selection, and correlation structure selection in turn depends on both the covariates and the variance function being chosen appropriately. This ordering leads to a stepwise model selection procedure that deploys a combination of different model selection criteria. Although these criteria share a common theoretical root in approximating the Kullback-Leibler distance, they are designed to address different aspects of model selection and have different merits and limitations. For example, the extended quasi-likelihood information criterion (EQIC) with a covariance penalty performs well for covariate selection even when the working variance function is misspecified, but EQIC contains little information on correlation structures. The proposed model selection strategies are outlined and a Monte Carlo assessment of their finite sample properties is reported. Two longitudinal studies are used for illustration.
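
As a brief illustration of the common root mentioned above, the following sketch states the Kullback-Leibler distance and the classical likelihood-based criterion that approximates it. The notation ($f$ for the true density, $g_\theta$ for a candidate model, $p$ for the number of parameters) is assumed here for illustration and is not taken from the abstract. The quasi-likelihood criteria discussed in the text, such as EQIC, replace the log-likelihood with a quasi-likelihood and adjust the penalty; their exact forms are not given in this abstract and are not reproduced here.

```latex
% Kullback-Leibler distance between the true density f and a candidate g_theta
\mathrm{KL}(f, g_\theta)
  = \mathrm{E}_f\!\left[\log f(Y)\right] - \mathrm{E}_f\!\left[\log g_\theta(Y)\right]

% The first term does not depend on the candidate model, so ranking models by
% KL distance is equivalent to ranking them by the expected log-likelihood.
% Akaike's criterion estimates -2 times that expectation (up to a constant),
% correcting the optimism of the maximized log-likelihood with a penalty of 2p:
\mathrm{AIC} = -2 \log L(\hat{\theta}) + 2p
```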
