Analysis of multiple exposures in the case‐crossover design via sparse conditional likelihood

We adapt the least absolute shrinkage and selection operator (lasso) and other sparse methods (the elastic net and bootstrapped versions of the lasso) to the conditional logistic regression model and provide a full R implementation. These variable selection procedures are applied in the context of case-crossover studies. We study the performance of conventional and sparse modelling strategies by simulation, then empirically compare the results of these methods in an analysis of the association between exposure to medicinal drugs and the risk of causing an injurious road traffic crash in elderly drivers. Controlling the false discovery rate of lasso-type methods remains problematic, but conventional methods share this problem. The sparse methods can provide a global analysis of dependencies, and we conclude that some of the variants compared here are valuable tools for case-crossover studies with a large number of variables.
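
To make the idea of a lasso-penalized conditional likelihood concrete, here is a minimal sketch in Python. It is not the R implementation mentioned above, and it rests on assumptions that are ours, not the paper's: each subject contributes exactly one case period and one control period (1:1 matching), so the conditional logistic likelihood reduces to an intercept-free logistic model on the within-subject exposure differences, and the fit uses a plain proximal-gradient (ISTA) loop. The function names `soft_threshold` and `lasso_clogit_pairs` and the toy data are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * |.|_1: shrink each entry toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_clogit_pairs(d, lam, n_iter=500):
    """L1-penalized conditional logistic fit for 1:1 matched periods (sketch).

    d   : (n, p) array of exposure differences, case period minus control period
    lam : lasso penalty weight on the mean negative conditional log-likelihood
    """
    n, p = d.shape
    beta = np.zeros(p)
    # The gradient of the mean loss is Lipschitz with constant ||d||_2^2 / (4n).
    lipschitz = np.linalg.norm(d, 2) ** 2 / (4.0 * n)
    step = 1.0 / max(lipschitz, 1e-12)
    for _ in range(n_iter):
        eta = d @ beta
        # w = 1 / (1 + exp(eta)), written with tanh to avoid overflow
        w = 0.5 * (1.0 - np.tanh(eta / 2.0))
        # Gradient of mean(log(1 + exp(-eta))) with respect to beta
        grad = -(d.T @ w) / n
        # Gradient step followed by soft-thresholding (the lasso proximal step)
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Toy data, made up for illustration: 8 subjects, 2 binary exposures.
# Each row is (exposure in case period) - (exposure in control period).
d = np.array([[1, 1], [1, -1], [1, 1], [1, -1],
              [1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
beta_hat = lasso_clogit_pairs(d, lam=0.1)
```

In this toy data the second exposure is equally often present in the case and the control period, so its coefficient is set exactly to zero, while the coefficient of the first exposure is shrunk below its unpenalized value of log 3. A sufficiently large `lam` sets every coefficient to zero.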
