Nonlinear Joint Unsupervised Feature Selection

Many machine learning and data mining tasks in the era of big data involve high-dimensional data. Feature selection, as a dimension reduction technique, alleviates the curse of dimensionality while preserving interpretability. In this paper, we focus on unsupervised feature selection, since class labels are usually expensive to obtain. Unsupervised feature selection is typically more challenging than its supervised counterpart because class labels are not available to guide the search. Recently, regression-based methods with L2,1 norms have gained much popularity because they evaluate features jointly; however, they consider only linear correlations between features and pseudo-labels. We propose a novel nonlinear joint unsupervised feature selection method based on kernel alignment. The aim is to find a succinct set of features that best aligns with the original features in the kernel space. The method evaluates features jointly in a nonlinear manner and provides a good ‘0/1’ approximation for the selection indicator vector. We formulate the selection as a constrained optimization problem and develop a Spectral Projected Gradient (SPG) method to solve it. Experimental results on several real-world datasets demonstrate that the proposed method significantly outperforms state-of-the-art approaches.
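The kernel alignment idea can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the details here are assumptions made for illustration: a Gaussian kernel, a per-feature weight vector `w` in [0, 1] that scales the columns before the kernel is computed, and the uncentered alignment score (normalized Frobenius inner product of two kernel matrices). The function names are hypothetical, and the sketch only evaluates the alignment for a given `w`; it does not implement the SPG solver.

```python
import numpy as np

def gaussian_kernel(X, w, sigma=1.0):
    """Gaussian kernel on the rows of X after scaling column j by w[j] (assumed form)."""
    Z = X * w                                   # broadcast weights over columns
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    d2 = np.maximum(d2, 0.0)                    # guard against tiny negative round-off
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_alignment(K1, K2):
    """Uncentered alignment: <K1, K2>_F / (||K1||_F * ||K2||_F)."""
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

def alignment_score(X, w, sigma=1.0):
    """Alignment between the kernel on weighted features and the kernel on all features."""
    K_full = gaussian_kernel(X, np.ones(X.shape[1]), sigma)
    K_sel = gaussian_kernel(X, w, sigma)
    return kernel_alignment(K_sel, K_full)

# Toy data: three informative columns and two near-constant columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
X[:, 3:] *= 0.01

w_all = np.ones(5)
w_informative = np.array([1.0, 1.0, 1.0, 0.0, 0.0])
w_near_constant = np.array([0.0, 0.0, 0.0, 1.0, 1.0])

score_all = alignment_score(X, w_all)
score_informative = alignment_score(X, w_informative)
score_near_constant = alignment_score(X, w_near_constant)
```

On this toy data, keeping the three informative columns should align with the full kernel far better than keeping the two near-constant ones, which is the behavior a selection indicator vector is meant to capture. In a projected-gradient scheme, a relaxed `w` would be kept feasible with a step such as `np.clip(w, 0.0, 1.0)`; the actual constraint set used by the paper is not stated in the abstract.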
