FS_SFS: A novel feature selection method for support vector machines

In many pattern recognition applications, high-dimensional feature vectors impose a high computational cost and carry the risk of "overfitting". Feature selection addresses the dimensionality reduction problem by determining the subset of available features that is most essential for classification. This paper presents a novel feature selection method named filtered and supported sequential forward search (FS_SFS) in the context of support vector machines (SVM). Compared with conventional wrapper methods that employ the SFS strategy, FS_SFS has two important properties that reduce computation time. First, it dynamically maintains a subset of samples for training the SVM. Because not all the available samples participate in the training process, the computational cost of obtaining a single SVM classifier is decreased. Second, a new criterion, which takes into consideration both the discriminant ability of individual features and the correlation between them, is proposed to effectively filter out nonessential features. As a result, the total number of SVM training runs is significantly reduced and the overfitting problem is alleviated. The proposed approach is tested on both synthetic and real data to demonstrate its effectiveness and efficiency.
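The filtering idea can be illustrated with a small sketch. The abstract does not state the actual criterion, so everything here is an assumption made for illustration: the score (a two-class Fisher ratio discounted by the largest absolute correlation with already selected features), the function names, and the `keep` parameter are all hypothetical. A nearest-centroid classifier stands in for SVM training to keep the example dependency-free, and the sample-subset ("supported") part of FS_SFS is not modeled.

```python
import math
import random


def fisher_ratio(column, y):
    """Two-class separability of one feature: (m0 - m1)^2 / (v0 + v1)."""
    a = [x for x, t in zip(column, y) if t == 0]
    b = [x for x, t in zip(column, y) if t == 1]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / len(a)
    vb = sum((x - mb) ** 2 for x in b) / len(b)
    return (ma - mb) ** 2 / (va + vb + 1e-12)


def abs_corr(u, v):
    """Absolute Pearson correlation between two feature columns."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    su = math.sqrt(sum((p - mu) ** 2 for p in u))
    sv = math.sqrt(sum((q - mv) ** 2 for q in v))
    return abs(cov) / (su * sv + 1e-12)


def filter_candidates(X, y, selected, keep):
    """Assumed filter: rank unselected features by Fisher ratio, discounted
    by their strongest correlation with any already selected feature."""
    cols = [[row[j] for row in X] for j in range(len(X[0]))]
    scored = []
    for j in range(len(cols)):
        if j in selected:
            continue
        redundancy = max((abs_corr(cols[j], cols[s]) for s in selected),
                         default=0.0)
        scored.append((fisher_ratio(cols[j], y) * (1.0 - redundancy), j))
    scored.sort(reverse=True)
    return [j for _, j in scored[:keep]]


def centroid_accuracy(X, y, features):
    """Stand-in for 'train a classifier and score it' (NOT an SVM):
    resubstitution accuracy of a nearest-centroid rule."""
    cents = {}
    for label in (0, 1):
        rows = [r for r, t in zip(X, y) if t == label]
        cents[label] = [sum(r[j] for r in rows) / len(rows) for j in features]
    correct = 0
    for r, t in zip(X, y):
        dist = {lab: sum((r[j] - c) ** 2 for j, c in zip(features, cents[lab]))
                for lab in cents}
        correct += min(dist, key=dist.get) == t
    return correct / len(X)


def filtered_sfs(X, y, evaluate, n_select, keep=2):
    """Sequential forward search that only evaluates the `keep` candidates
    surviving the filter, instead of every remaining feature."""
    selected = []
    while len(selected) < n_select:
        candidates = filter_candidates(X, y, selected, keep)
        if not candidates:
            break
        best = max(candidates, key=lambda j: evaluate(X, y, selected + [j]))
        selected.append(best)
    return selected


def make_toy_data(n=400, seed=0):
    """Synthetic data: f1 nearly duplicates f0, f2 is informative, f3 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        t = i % 2
        f0 = t + rng.gauss(0, 0.5)
        f1 = f0 + rng.gauss(0, 0.1)
        f2 = t + rng.gauss(0, 0.5)
        f3 = rng.gauss(0, 1.0)
        X.append([f0, f1, f2, f3])
        y.append(t)
    return X, y


X, y = make_toy_data()
selected = filtered_sfs(X, y, centroid_accuracy, n_select=2)
print(selected)
```

With `keep=2`, each round trains and scores at most two candidate classifiers rather than one per remaining feature, which is the sense in which a filter can cut the number of training runs in a wrapper search. The correlation discount is what keeps the near-duplicate feature from being added alongside the feature it copies.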
