Increased classification accuracy and speedup through pair-wise feature selection for support vector machines

Support vector machines (SVMs) are binary classifiers. A multi-class classifier can be built from them by training one binary classifier for each pair of classes, or one per class using a one-versus-all strategy. Feature selection algorithms often search for a single set of features to be shared by all of the binary classifiers. This ignores the fact that features that discriminate well between two particular classes may perform poorly on other class combinations, so the selection process may leave them out of the common set. It is shown that selecting features separately for each binary class combination improves overall classification accuracy (by as much as 2.1%), significantly reduces feature selection time (a speedup of 3.2 times), and reduces the time required to train a multi-class support vector machine. A further benefit is that considerably less feature selection time is required when additional classes are added to the training data: the features already selected for the existing class combinations remain valid, so feature selection only needs to be run for the newly created class combinations.
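The idea of per-pair selection, and why adding a class is cheap, can be illustrated with a minimal sketch. The abstract does not specify the selection procedure, so a simple Fisher-style filter score is swapped in here purely for illustration; the function names, the toy data, and the choice of `k` are all assumptions rather than anything taken from the paper.

```python
from itertools import combinations
from statistics import mean, pvariance


def fisher_score(a, b):
    """Squared gap between the two sample means over their summed variance."""
    spread = pvariance(a) + pvariance(b)
    return (mean(a) - mean(b)) ** 2 / (spread + 1e-12)


def select_for_pair(data, c1, c2, k):
    """Return the indices of the k features that best separate c1 from c2."""
    n_features = len(data[c1][0])
    scores = []
    for j in range(n_features):
        col1 = [row[j] for row in data[c1]]
        col2 = [row[j] for row in data[c2]]
        scores.append((fisher_score(col1, col2), j))
    top = sorted(scores, reverse=True)[:k]
    return sorted(j for _, j in top)


def select_pairwise(data, k, cache=None):
    """Select features for every class pair, skipping pairs already in cache.

    Returns the (updated) cache and the list of pairs actually computed.
    """
    cache = {} if cache is None else cache
    computed = []
    for pair in combinations(sorted(data), 2):
        if pair not in cache:
            cache[pair] = select_for_pair(data, pair[0], pair[1], k)
            computed.append(pair)
    return cache, computed


# Hypothetical toy data: class label -> list of 3-feature examples.
data = {
    "A": [[0.0, 5.0, 1.0], [0.2, 5.1, 1.1], [0.1, 4.9, 0.9]],
    "B": [[3.0, 5.0, 1.0], [3.1, 5.2, 1.2], [2.9, 4.8, 0.8]],
    "C": [[0.1, 5.0, 4.0], [0.0, 5.1, 4.2], [0.2, 4.9, 3.8]],
}
features, first_run = select_pairwise(data, k=1)

# Adding a class later: only the pairs involving "D" are selected again.
data["D"] = [[0.1, 9.0, 1.0], [0.0, 9.1, 1.1], [0.2, 8.9, 0.9]]
features, second_run = select_pairwise(data, k=1, cache=features)
```

In this toy data, classes A and B differ only in feature 0 while A and C differ only in feature 2, so each pair ends up with a different feature; a single shared set of size one could not serve both. The second call recomputes only the three pairs that involve the new class, which mirrors the incremental benefit described above.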
