PGEE: An R Package for Analysis of Longitudinal Data with High-Dimensional Covariates

We introduce PGEE, an R package that implements the penalized generalized estimating equations (penalized GEE) procedure proposed by Wang et al. (2012) for analyzing longitudinal data with a large number of covariates. The package provides three main functions: CVfit, PGEE, and MGEE. CVfit selects the tuning parameter for the penalized GEE by cross-validation. PGEE performs simultaneous estimation and variable selection for longitudinal data with high-dimensional covariates, whereas MGEE fits an unpenalized GEE to the same data for comparison. We illustrate the package using a yeast cell-cycle gene expression data set.
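To make the penalization concrete, the sketch below evaluates the derivative of the SCAD penalty of Fan and Li (2001), which is the penalty used in the penalized GEE of Wang et al. (2012). It is a minimal illustration in Python and is not code from the PGEE package: the function names `scad_derivative` and `penalty_term` are hypothetical, and the default `a = 3.7` is the value conventionally recommended for SCAD.

```python
def scad_derivative(theta, lam, a=3.7):
    """SCAD penalty derivative q_lam(|theta|).

    Equals lam for |theta| <= lam, decays linearly to zero on
    (lam, a * lam], and is zero beyond a * lam.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if a <= 2:
        raise ValueError("a must be greater than 2")
    theta = abs(theta)
    if theta <= lam:
        return lam
    return max(a * lam - theta, 0.0) / (a - 1)


def penalty_term(beta, lam, a=3.7):
    """Per-coefficient term q_lam(|beta_j|) * sign(beta_j).

    In a penalized estimating equation this term is subtracted from the
    unpenalized estimating function, coefficient by coefficient.
    """
    def sign(x):
        return (x > 0) - (x < 0)

    return [scad_derivative(b, lam, a) * sign(b) for b in beta]


if __name__ == "__main__":
    # Hypothetical coefficients: one small, one moderate, one large.
    for b, p in zip([0.05, -0.2, 1.5], penalty_term([0.05, -0.2, 1.5], lam=0.1)):
        print(f"beta = {b:+.2f}  penalty term = {p:+.4f}")
```

Small coefficients receive the full penalty `lam`, which pushes them toward zero, while coefficients larger than `a * lam` receive no penalty at all. This is why a SCAD-type penalty can select variables without heavily shrinking large effects.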

[1]  Søren Højsgaard, et al.  The R Package geepack for Generalized Estimating Equations, 2005.

[2]  Heng Lian, et al.  Generalized additive partial linear models for clustered data with diverging number of covariates using GEE, 2014.

[3]  A. Qu, et al.  Estimation and model selection in generalized additive partial linear models for correlated data with diverging number of covariates, 2014, 1405.6030.

[4]  J. Goeman.  L1 Penalized Estimation in the Cox Proportional Hazards Model, 2009, Biometrical journal. Biometrische Zeitschrift.

[5]  Stuart R. Lipsitz, et al.  Review of Software to Fit Generalized Estimating Equation Regression Models, 1999.

[6]  T. Hothorn, et al.  Multivariate Normal and t Distributions, 2016.

[7]  Jian Huang, et al.  Coordinate Descent Algorithms for Nonconvex Penalized Regression, with Applications to Biological Feature Selection, 2011, The Annals of Applied Statistics.

[8]  Trevor Hastie, et al.  Regularization Paths for Generalized Linear Models via Coordinate Descent, 2010, Journal of Statistical Software.

[9]  S. Zeger, et al.  Longitudinal data analysis using generalized linear models, 1986.

[10]  Nicola J. Rinaldi, et al.  Transcriptional Regulatory Networks in Saccharomyces cerevisiae, 2002, Science.

[11]  Jianqing Fan, et al.  Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties, 2001.

[12]  J. Hardin, et al.  Generalized Estimating Equations, 2002.

[13]  Michael Ruogu Zhang, et al.  Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization, 1998, Molecular Biology of the Cell.

[14]  Lan Wang, et al.  GEE analysis of clustered binary data with diverging number of covariates, 2011, 1103.1795.

[15]  Annie Qu, et al.  Penalized Generalized Estimating Equations for High‐Dimensional Longitudinal Data Analysis, 2012, Biometrics.

[16]  Hongzhe Li, et al.  Group SCAD regression analysis for microarray time course gene expression data, 2007, Bioinform.

[17]  Brent A. Johnson, et al.  Penalized Estimating Functions and Variable Selection in Semiparametric Regression Models, 2008, Journal of the American Statistical Association.

[18]  K Y Liang, et al.  Longitudinal data analysis for discrete and continuous outcomes, 1986, Biometrics.

[19]  Annie Qu, et al.  Model Selection for Correlated Data with Diverging Number of Parameters, 2013.