Correlated Feature Selection with Extended Exclusive Group Lasso

In many high-dimensional classification or regression problems in a biological context, complete identification of the set of informative features is often as important as predictive accuracy, since it can provide mechanistic insight and conceptual understanding. Lasso and related algorithms are widely used because their sparse solutions naturally identify a set of informative features. However, the Lasso performs erratically when features are correlated. This limits its use in biological problems, where features such as genes often work together in pathways, producing sets of highly correlated features. In this paper, we examine the performance of a Lasso derivative, the exclusive group Lasso, in this setting. We propose fast algorithms to solve the exclusive group Lasso, and introduce a solution for the case when the underlying group structure is unknown. The solution combines stability selection with random group allocation and the introduction of artificial features. Experiments on both synthetic and real-world data highlight the advantages of the proposed methodology over the Lasso for comprehensive selection of informative features.
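As a minimal illustration (a sketch, not the paper's implementation), the exclusive group Lasso penalty is the squared ℓ1 norm of the coefficients within each group, summed over groups; it encourages sparsity *within* groups rather than selecting or discarding whole groups as the group Lasso does. The function name and group encoding below are illustrative choices:

```python
import numpy as np

def exclusive_group_lasso_penalty(w, groups):
    """Exclusive group Lasso (l_{1,2}) penalty:
    sum over groups g of (sum_{i in g} |w_i|)^2.
    `groups` is a list of index lists partitioning the coefficients.
    """
    w = np.asarray(w, dtype=float)
    return sum(np.abs(w[idx]).sum() ** 2 for idx in groups)

# Example: two groups over three coefficients.
w = [1.0, -2.0, 3.0]
groups = [[0, 1], [2]]
print(exclusive_group_lasso_penalty(w, groups))  # (1+2)^2 + 3^2 = 18.0
```

Because the penalty is a squared ℓ1 norm per group, correlated features placed in the same group compete with one another, so the penalty tends to keep one representative per group active.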
