Adaptive group-regularized logistic elastic net regression

In high-dimensional data settings, additional information on the features is often available. Examples of such external information in omics research are (i) $p$-values from a previous study and (ii) omics annotation. Including this information in the analysis may enhance classification performance and feature selection, but doing so is not straightforward. We propose a group-regularized (logistic) elastic net regression method, in which each penalty parameter corresponds to a group of features defined by the external information. The method, termed gren, uses the Bayesian formulation of logistic elastic net regression to estimate both the model and penalty parameters in an approximate empirical-variational Bayes framework. Simulations and applications to three cancer genomics studies and one Alzheimer metabolomics study show that, if the partitioning of the features is informative, classification performance and feature selection are indeed enhanced.
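To make the idea of group-specific penalties concrete, the following is a minimal sketch in Python, not the gren implementation. It assumes an illustrative parametrization in which each group has a multiplier that scales the ridge term directly and the lasso term by its square root; the abstract does not state the exact form. The multipliers are taken as fixed inputs here, whereas gren estimates them by empirical-variational Bayes, and the fit uses plain proximal gradient descent as a stand-in. All function and argument names (`group_penalty`, `neg_log_lik`, `fit_proximal_gradient`, `multipliers`, `groups`) are hypothetical.

```python
import numpy as np


def group_penalty(beta, groups, lambda1, lambda2, multipliers):
    """Elastic net penalty with one multiplier per feature group.

    Assumed (illustrative) form:
        lambda1 * sum_j sqrt(m_g(j)) * |beta_j|
        + 0.5 * lambda2 * sum_j m_g(j) * beta_j**2
    where g(j) is the group index of feature j.
    """
    m = multipliers[groups]  # per-feature multiplier, looked up by group index
    l1 = lambda1 * np.sum(np.sqrt(m) * np.abs(beta))
    l2 = 0.5 * lambda2 * np.sum(m * beta ** 2)
    return l1 + l2


def neg_log_lik(beta, X, y):
    """Logistic negative log-likelihood for labels y in {0, 1}."""
    eta = X @ beta
    return np.sum(np.logaddexp(0.0, eta) - y * eta)


def fit_proximal_gradient(X, y, groups, lambda1, lambda2, multipliers,
                          n_iter=500):
    """Minimize neg_log_lik + group_penalty by proximal gradient descent.

    This is a generic optimizer used for illustration only; it treats the
    group multipliers as known rather than estimating them.
    """
    n, p = X.shape
    m = multipliers[groups]
    # Lipschitz bound for the gradient of the smooth part
    # (logistic loss plus the weighted ridge term).
    lipschitz = 0.25 * np.linalg.norm(X, 2) ** 2 + lambda2 * m.max()
    step = 1.0 / lipschitz
    beta = np.zeros(p)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (prob - y) + lambda2 * m * beta
        z = beta - step * grad
        # Soft-thresholding with a feature-specific threshold: features in
        # groups with larger multipliers are shrunk more strongly.
        thresh = step * lambda1 * np.sqrt(m)
        beta = np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)
    return beta
```

In this sketch, a group believed to be uninformative (for example, features with large $p$-values in a previous study) would receive a multiplier above 1 and hence stronger shrinkage, while an informative group would receive a multiplier below 1.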
