Feature selection based on sparse imputation

Feature selection, which aims to obtain valuable feature subsets, has been an active research topic for years. Designing an effective evaluation metric is the key problem in feature selection. In this paper, we address this problem by using imputation quality to identify meaningful features, and propose the feature selection via sparse imputation (FSSI) method. The key idea is to use a sparse representation criterion to test each individual feature. Feature-based classification is used to evaluate the proposed method. Comparative studies are conducted against classic feature selection methods such as the Fisher score and the Laplacian score. Experimental results on benchmark data sets demonstrate the effectiveness of the FSSI method.
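The abstract leaves the scoring procedure implicit, so the following is only a minimal sketch of the idea, not the paper's exact algorithm: each feature of a held-out sample is masked in turn, the sample's sparse code is computed over a dictionary of training samples using the remaining (observed) features, and the masked feature is scored by its imputation error. The use of scikit-learn's Lasso as the ℓ1 solver, the dictionary construction, and the ranking direction (lower error = more valuable feature) are all illustrative assumptions.

```python
# Minimal sketch of feature scoring via sparse imputation (assumptions noted above).
import numpy as np
from sklearn.linear_model import Lasso

def fssi_scores(X_train, X_test, alpha=0.01):
    """Per-feature imputation error under sparse representation.

    X_train, X_test : (n_samples, n_features) data matrices.
    Returns one mean imputation error per feature (lower = better imputed).
    """
    D = X_train.T                      # dictionary: columns are training samples
    n_features = D.shape[0]
    errors = np.zeros(n_features)
    for x in X_test:
        for j in range(n_features):
            obs = np.arange(n_features) != j   # mask feature j
            # Sparse code of the sample, fit on the observed features only.
            coder = Lasso(alpha=alpha, fit_intercept=False)
            coder.fit(D[obs, :], x[obs])
            x_j_hat = D[j, :] @ coder.coef_    # impute the held-out feature
            errors[j] += (x_j_hat - x[j]) ** 2
    return errors / len(X_test)
```

Under these assumptions, features would be ranked by ascending imputation error and the top-k subset passed to a standard classifier (e.g., an SVM) for the feature-based classification evaluation the abstract describes.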
