Non-convex Regularized Self-representation for Unsupervised Feature Selection

Feature selection aims to select a subset of features to decrease time complexity, reduce storage burden, and improve the generalization ability of classification or clustering. For the vast amount of unlabeled high-dimensional data, unsupervised feature selection is effective in alleviating the curse of dimensionality and finds applications in various fields. In this paper, we propose a non-convex regularized self-representation (RSR) model in which each feature is represented by a linear combination of the other features, and we impose L2,p-norm (0 < p < 1) regularization on the self-representation coefficients for unsupervised feature selection. Compared with conventional L2,1-norm regularization, choosing p < 1 yields a much sparser solution for the self-representation coefficients and is more effective in selecting salient features. To solve the non-convex RSR model, we further propose an efficient iterative reweighted least squares (IRLS) algorithm with guaranteed convergence to a fixed point. Extensive experimental results on nine datasets show that our feature selection method with small p is more effective: it mostly outperforms both the features selected at p = 1 and other state-of-the-art unsupervised feature selection methods in terms of classification accuracy and clustering results.
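To make the idea concrete, the following is a minimal sketch of an IRLS-style solver for an L2,p-regularized self-representation problem. It is not the paper's exact algorithm: the abstract does not state the loss function, so the sketch assumes a squared Frobenius reconstruction loss, i.e. it minimizes ||X - XW||_F^2 + lam * sum_i ||w_i||_2^p over the coefficient matrix W, where w_i is the i-th row of W. The function names, the ridge initialization, the smoothing constant eps, and the default parameter values are all illustrative assumptions.

```python
import numpy as np


def rsr_l2p_irls(X, lam=1.0, p=0.5, n_iter=50, eps=1e-8):
    """Sketch of IRLS for min_W ||X - XW||_F^2 + lam * sum_i ||w_i||_2^p.

    X has shape (n_samples, n_features). The squared Frobenius loss is an
    assumption made for this sketch; the abstract does not specify the loss.
    Returns the (n_features, n_features) coefficient matrix W.
    """
    d = X.shape[1]
    G = X.T @ X  # Gram matrix of the features
    # Ridge-regularized initialization (an arbitrary but safe starting point).
    W = np.linalg.solve(G + lam * np.eye(d), G)
    for _ in range(n_iter):
        # Reweighting step: each row gets weight (||w_i||^2 + eps)^((p-2)/2).
        # eps smooths the weight so that it stays finite for near-zero rows.
        row_sq = np.sum(W ** 2, axis=1)
        weights = (row_sq + eps) ** ((p - 2) / 2)
        # Weighted least-squares step: setting the gradient of the
        # reweighted objective to zero gives
        # (G + (lam * p / 2) * diag(weights)) W = G.
        W = np.linalg.solve(G + 0.5 * lam * p * np.diag(weights), G)
    return W


def select_features(X, k, **kwargs):
    """Rank features by the L2 norm of their row in W and keep the top k."""
    W = rsr_l2p_irls(X, **kwargs)
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(scores)[::-1][:k]
```

A call such as `select_features(X, k=10, lam=1.0, p=0.5)` returns the indices of the ten features whose rows in W have the largest norms. Rows driven toward zero by the regularizer correspond to features that contribute little to reconstructing the others, which is why a smaller p, giving sparser rows, tends to separate salient features more sharply.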
