Large Margin Subspace Learning for feature selection

Recent research has shown the benefits of the large margin framework for feature selection. In this paper, we propose a novel feature selection algorithm, termed Large Margin Subspace Learning (LMSL), which seeks a projection matrix that maximizes the margin of a given sample, defined as the distance between the nearest miss (the nearest neighbor with a different label) and the nearest hit (the nearest neighbor with the same label) of that sample. Instead of computing the nearest neighbors of the given sample directly, we treat each sample with a different (the same) label as a potential nearest miss (hit), with a probability estimated by kernel density estimation. In this way, the nearest miss (hit) is calculated as an expectation over all samples of the different (same) class. To perform feature selection, an ℓ2,1-norm is imposed on the projection matrix to enforce row sparsity. An efficient algorithm is then proposed to solve the resulting optimization problem. Comprehensive experiments compare the proposed algorithm with five state-of-the-art algorithms: RFS, SPFS, mRMR, TR and LLFS. It achieves better performance than the first four. Compared with LLFS, it achieves competitive performance with significantly faster computation.
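The two ingredients named in the abstract, a probability-weighted margin and an ℓ2,1-norm on the projection matrix, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The exponential kernel, the bandwidth `sigma`, the Euclidean distance in the projected space, and the Relief-style reading of the margin (expected distance to a miss minus expected distance to a hit) are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def l21_norm(W):
    """l2,1-norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def expected_margin(X, y, W, i, sigma=1.0):
    """Expected margin of sample i after projecting X with W.

    Assumptions (not taken from the paper): an exponential kernel with
    bandwidth sigma turns distances into neighbor probabilities, and the
    margin is the expected miss distance minus the expected hit distance.
    Each class must contain at least two samples.
    """
    Z = X @ W                              # project samples into the subspace
    d = np.linalg.norm(Z - Z[i], axis=1)   # distance from sample i to every sample
    k = np.exp(-d / sigma)                 # kernel weight: closer means more likely

    hit = (y == y[i])
    hit[i] = False                         # a sample is not its own nearest hit
    miss = (y != y[i])

    p_hit = k[hit] / k[hit].sum()          # probability of being the nearest hit
    p_miss = k[miss] / k[miss].sum()       # probability of being the nearest miss
    return float(p_miss @ d[miss] - p_hit @ d[hit])


def rank_features(W):
    """Rank original features by the row norms of W, largest first."""
    return np.argsort(-np.linalg.norm(W, axis=1))


# Toy data: feature 0 separates the two classes, feature 1 does not.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]])
y = np.array([0, 0, 1, 1])
W = np.eye(2)
margin_0 = expected_margin(X, y, W, i=0)
penalty = l21_norm(W)
```

Because the ℓ2,1-norm penalizes whole rows of the projection matrix, a row driven to zero removes the corresponding original feature from every projected dimension, which is why the row norms can serve as feature scores in `rank_features`.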
