A Scalable Empirical Bayes Approach to Variable Selection in Generalized Linear Models

Abstract: A new empirical Bayes approach to variable selection in the context of generalized linear models is developed. The proposed algorithm scales to situations in which the number of putative explanatory variables is very large, possibly much larger than the number of responses. The coefficients in the linear predictor are modeled as a three-component mixture, allowing each explanatory variable to have a random positive effect on the response, a random negative effect, or no effect. A key assumption is that only a small (but unknown) fraction of the candidate variables have a nonzero effect. This assumption, together with treating the coefficients as random effects, facilitates an approach that is computationally efficient. In particular, the number of parameters that have to be estimated is small and remains constant regardless of the number of explanatory variables. The model parameters are estimated using a generalized alternating maximization algorithm, which is scalable and converges significantly faster than simulation-based fully Bayesian methods. Supplementary materials for this article are available online.
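
To make the three-component mixture concrete, the following is a minimal sketch of how coefficients could be drawn under such a prior. It is an illustration only, not the article's estimation algorithm: the function name `sample_coefficients`, the Gaussian form of the two nonzero components, and all default values (`p_pos`, `p_neg`, `mean`, `sd`) are assumptions chosen for the example. It shows that the mixture is described by a handful of parameters whose count does not depend on the number of candidate variables.

```python
import random


def sample_coefficients(n_vars, p_pos=0.02, p_neg=0.02, mean=1.0, sd=0.25, seed=0):
    """Draw n_vars coefficients from an illustrative three-component mixture.

    All parameter names and default values are hypothetical; the Gaussian
    shape of the nonzero components is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    coefs = []
    for _ in range(n_vars):
        u = rng.random()
        if u < p_pos:
            # "positive effect" component, centred at +mean
            coefs.append(rng.gauss(mean, sd))
        elif u < p_pos + p_neg:
            # "negative effect" component, centred at -mean
            coefs.append(rng.gauss(-mean, sd))
        else:
            # "no effect" component: the coefficient is exactly zero
            coefs.append(0.0)
    return coefs


# Many candidate variables, but the prior is still described by four numbers.
beta = sample_coefficients(10_000)
n_nonzero = sum(b != 0.0 for b in beta)
```

With these assumed mixing probabilities, only about 4% of the coefficients are expected to be nonzero, which mirrors the sparsity assumption stated in the abstract.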
