A General Framework for Sparsity Regularized Feature Selection via Iteratively Reweighted Least Square Minimization

A variety of feature selection methods based on sparsity regularization have been developed with different loss functions and sparse regularization functions. Building on these existing sparsity regularized feature selection methods, we propose a general sparsity regularized feature selection (GSR-FS) algorithm that optimizes an ℓ2,r-norm (0 < r ≤ 2) based loss function with an ℓ2,p-norm (0 < p ≤ 2) sparse regularization. The ℓ2,r-norm based loss function provides the flexibility to balance data fitting against robustness to outliers by tuning r, and the ℓ2,p-norm based regularization with 0 < p ≤ 1 boosts sparsity for feature selection. To solve the resulting optimization problem, which involves multiple non-smooth and, for small r or p, non-convex terms, we develop an efficient solver under the general umbrella of Iteratively Reweighted Least Squares (IRLS) algorithms. Our algorithm is proven to converge with a theoretical convergence order of at least min(2 − r, 2 − p). Experimental results demonstrate that our method achieves competitive feature selection performance on publicly available datasets compared with state-of-the-art feature selection methods, at reduced computational cost.
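The IRLS approach described above can be sketched in a few lines: at each iteration, the ℓ2,r loss and ℓ2,p regularizer are majorized by weighted quadratic terms, and the resulting weighted least squares problem has a closed-form solution. The sketch below is a minimal illustration under assumed conventions (smoothing constant eps, row-wise weights, ridge initialization), not the paper's exact solver; the function name gsr_fs_irls and its defaults are hypothetical.

```python
import numpy as np

def gsr_fs_irls(X, Y, r=1.0, p=1.0, lam=0.1, n_iter=50, eps=1e-8):
    """Illustrative IRLS sketch (not the paper's exact algorithm) for
    min_W ||XW - Y||_{2,r}^r + lam * ||W||_{2,p}^p, 0 < r, p <= 2.
    Returns W and features ranked by the row norms of W."""
    n, d = X.shape
    # ridge initialization so the first set of weights is well-defined
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)
    for _ in range(n_iter):
        E = X @ W - Y
        # diagonal reweighting terms: ~ ||e_i||^{r-2} per sample row and
        # ~ ||w_j||^{p-2} per feature row, smoothed by eps near zero
        de = (np.sum(E * E, axis=1) + eps) ** ((r - 2) / 2)
        dw = (np.sum(W * W, axis=1) + eps) ** ((p - 2) / 2)
        # closed-form weighted least squares update
        A = X.T @ (de[:, None] * X) + lam * np.diag(dw)
        W = np.linalg.solve(A, X.T @ (de[:, None] * Y))
    ranking = np.argsort(-np.linalg.norm(W, axis=1))
    return W, ranking
```

For r = p = 1 (robust loss, strong sparsity) the relevant features accumulate large row norms in W while irrelevant rows shrink toward zero, so the top of the ranking gives the selected feature subset.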
