Unsupervised Feature Selection by Preserving Stochastic Neighbors

Feature selection is an important technique for alleviating the curse of dimensionality. Unsupervised feature selection is more challenging than its supervised counterpart due to the lack of labels. In this paper, we present an effective method, Stochastic Neighbor-preserving Feature Selection (SNFS), for selecting discriminative features in the unsupervised setting. We employ the concept of stochastic neighbors and select the features that best preserve such stochastic neighbors by minimizing the Kullback-Leibler (KL) divergence between neighborhood distributions. The proposed approach measures feature utility jointly in a nonlinear way, and its 'push-pull' property allows discriminative features to be selected. We develop an efficient algorithm for optimizing the objective function based on the projected quasi-Newton method. Moreover, few existing methods provide a way to determine the optimal number of selected features, which hampers their utility in practice. Our approach is equipped with a guideline for choosing the number of features that provides nearly optimal performance in our experiments. Experimental results show that the proposed method significantly outperforms state-of-the-art methods on several real-world datasets.
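To make the neighbor-preserving criterion concrete, the sketch below (Python with NumPy/SciPy) illustrates one way such an objective can be written: P is an SNE-style neighborhood distribution computed from the original data, Q_w is the distribution induced after re-weighting features by a non-negative vector w, and the score is the sum of per-point KL divergences KL(P_i || Q_i). The function names, the element-wise weighting X * w, and the fixed bandwidth sigma are illustrative assumptions, not the paper's exact formulation, which additionally constrains the weights and is optimized with a projected quasi-Newton method.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def neighbor_distribution(X, sigma=1.0):
    """Row-stochastic neighborhood distribution p(j | i), as in SNE:
    p_{j|i} proportional to exp(-||x_i - x_j||^2 / (2 * sigma^2)), with p_{i|i} = 0."""
    D = squareform(pdist(X, "sqeuclidean"))
    A = np.exp(-D / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    return A / A.sum(axis=1, keepdims=True)


def kl_neighbor_objective(w, X, P, sigma=1.0, eps=1e-12):
    """Sum over points of KL(P_i || Q_i), where Q is the neighborhood
    distribution of the feature-weighted data X * w (w >= 0).
    Larger weights in the minimizer would indicate more useful features
    under this illustrative relaxation."""
    Q = neighbor_distribution(X * w, sigma)
    return np.sum(P * np.log((P + eps) / (Q + eps)))


# Toy usage: uniform weights recover P exactly (KL ~ 0); dropping half
# the features distorts the neighborhood distribution (KL > 0).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
P = neighbor_distribution(X)
w_full = np.ones(10)
w_half = np.concatenate([np.ones(5), np.zeros(5)])
print(kl_neighbor_objective(w_full, X, P), kl_neighbor_objective(w_half, X, P))
```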
