Groups-Keeping Solution Path Algorithm for Sparse Regression with Automatic Feature Grouping

Feature selection is one of the most important research topics in data mining, with many applications. In practical problems, features often exhibit group structure in how they affect the outcome, so it is crucial to automatically identify homogeneous groups of features in high-dimensional data analysis. The octagonal shrinkage and clustering algorithm for regression (OSCAR) is an important sparse regression approach that performs automatic feature grouping and selection through an ℓ1 norm combined with a pairwise ℓ∞ norm. However, because of the complex structure of this penalty (especially the pairwise ℓ∞ norm), OSCAR has so far lacked a solution path algorithm, which is particularly useful for tuning the model. To address this challenge, we propose a groups-keeping solution path algorithm for the OSCAR model (OscarGKPath). Given a set of homogeneous feature groups and an accuracy bound ε, OscarGKPath fits the solutions over an interval of regularization parameters while keeping the feature groups unchanged; the entire solution path is then obtained by combining multiple such intervals. We prove that every solution on the path produced by OscarGKPath strictly satisfies the given accuracy bound ε. Experimental results on benchmark datasets confirm the effectiveness of OscarGKPath and show its superiority over the existing batch algorithm in cross validation.
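For concreteness, the OSCAR estimator referred to above solves the following problem in the standard formulation of Bondell and Reich; the parameterization with λ1 and λ2 is illustrative notation, and the paper may weight the two penalty terms differently:

\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \; \frac{1}{2}\lVert y - X\beta \rVert_2^2 \;+\; \lambda_1 \lVert \beta \rVert_1 \;+\; \lambda_2 \sum_{j < k} \max\{\, |\beta_j|,\, |\beta_k| \,\}

The ℓ1 term drives individual coefficients to zero (feature selection), while the pairwise ℓ∞ term encourages coefficients to share the same absolute value, so features whose coefficients tie in magnitude are automatically merged into a group.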
