In supervised learning scenarios, feature selection has been studied widely in the literature. Selecting features in unsupervised learning scenarios is a much harder problem, due to the absence of class labels that would guide the search for relevant information. Moreover, almost all previous unsupervised feature selection methods are "wrapper" techniques, which require a learning algorithm to evaluate candidate feature subsets. In this paper, we propose a "filter" method for feature selection that is independent of any learning algorithm and can be applied in either a supervised or an unsupervised fashion. The proposed method is based on the observation that, in many real-world classification problems, data points from the same class are often close to each other. The importance of a feature is therefore evaluated by its locality preserving power, which we call its Laplacian Score. We compare our method with data variance (unsupervised) and the Fisher score (supervised) on two data sets. Experimental results demonstrate the effectiveness and efficiency of our algorithm.
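The abstract does not spell out the construction, but a locality-preserving score of this kind is typically computed from a nearest-neighbor graph with heat-kernel weights and the resulting graph Laplacian. The following NumPy sketch illustrates that idea under those assumptions; the parameters `k` and `t` and the exact graph construction are illustrative choices, not details taken from the paper. Features whose values vary little across neighboring points receive a smaller score, marking them as more locality preserving.

```python
import numpy as np

def laplacian_score(X, k=5, t=1.0):
    """Illustrative locality-preserving score per feature (smaller = better).

    X : (n_samples, n_features) data matrix.
    Builds a k-nearest-neighbor graph with heat-kernel weights, then
    scores each feature by how much it varies across graph edges,
    normalized by its weighted variance.
    """
    n, m = X.shape
    # Pairwise squared Euclidean distances between samples.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # k-nearest-neighbor adjacency with heat-kernel weights (assumed construction).
    idx = np.argsort(sq, axis=1)[:, 1:k + 1]   # skip self at position 0
    S = np.zeros((n, n))
    for i in range(n):
        S[i, idx[i]] = np.exp(-sq[i, idx[i]] / t)
    S = np.maximum(S, S.T)                     # symmetrize: undirected graph
    d = S.sum(axis=1)                          # vertex degrees
    L = np.diag(d) - S                         # graph Laplacian
    scores = np.empty(m)
    for r in range(m):
        f = X[:, r]
        f_t = f - (f @ d) / d.sum()            # remove the degree-weighted mean
        den = f_t @ (d * f_t)                  # weighted variance of the feature
        scores[r] = (f_t @ L @ f_t) / den if den > 1e-12 else 0.0
    return scores
```

On toy data with two well-separated clusters, a feature that tracks the cluster structure scores lower than a pure-noise feature, so ranking features by ascending score would select it first.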