Mixed effect modelling and variable selection for quantile regression

It is known that the estimating equations for quantile regression (QR) can be solved using an EM algorithm in which the M-step is computed via weighted least squares, with weights computed at the E-step as expectations of independent generalized inverse-Gaussian variables. This fact is exploited here to extend QR to allow for random effects in the linear predictor. Convergence of the algorithm in this setting is established by showing that it is a generalized alternating minimization (GAM) procedure. A further modification of the EM algorithm allows us to adapt a recently proposed method for variable selection in mean regression models to the QR setting. Simulations show that the resulting method significantly outperforms variable selection in QR models using the lasso penalty. Applications to real data include a frailty QR analysis of hospital stays and variable selection for age at onset of lung cancer and for riboflavin production rate, using high-dimensional gene expression arrays for prediction.
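
The EM scheme for fixed-effects QR described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the sketch assumes the standard representation of the asymmetric Laplace distribution as a normal scale mixture, under which the E-step expectation of the inverse latent variable reduces to 1 / (p (1 - p) |r_i|) for residual r_i. The function name `qr_em`, the constant `theta`, and the clamp `eps` that guards against division by a zero residual are all hypothetical choices made for this example.

```python
import numpy as np

def qr_em(X, y, p=0.5, max_iter=2000, tol=1e-8, eps=1e-6):
    """Illustrative EM / weighted-least-squares sketch for the p-th regression quantile.

    Assumes the asymmetric Laplace scale-mixture representation; `eps` is an
    ad hoc clamp on |residual| and is not part of the source.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    theta = (1.0 - 2.0 * p) / (p * (1.0 - p))  # skewness constant of the mixture
    scale = 1.0 / (p * (1.0 - p))              # equals sqrt(theta^2 + 2 * tau^2)
    col_sums = X.sum(axis=0)                   # X' 1, used in the M-step
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from ordinary least squares
    for _ in range(max_iter):
        # E-step: expected inverse latent scale, one weight per observation
        r = y - X @ beta
        w = scale / np.maximum(np.abs(r), eps)
        # M-step: weighted least squares with a shift term from the skewness
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y - theta * col_sums)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Usage on simulated data (hypothetical example, not from the source)
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.standard_normal(n)
beta_q25 = qr_em(X, y, p=0.25)
# The share of observations below the fitted line should be close to p
print(beta_q25, np.mean(y < X @ beta_q25))
```

At a fixed point of this iteration with no clamped residuals, the weighted normal equations reduce to the usual QR estimating equations, which is why the weighted least squares update can stand in for a linear programming solver. With p = 0.5 and an intercept-only design the update becomes an iteratively reweighted average that approaches the sample median.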
