Combining SVMs with Various Feature Selection Strategies

This article investigates the performance of combining support vector machines (SVMs) with various feature selection strategies. Some are filter-type approaches, that is, general feature selection methods independent of the SVM; others are wrapper-type methods, that is, modifications of the SVM that can be used to select features. We apply these strategies while participating in the NIPS 2003 Feature Selection Challenge and rank third as a group.
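
To make the filter-type idea concrete, here is a minimal sketch of ranking features independently of any classifier and keeping the top few before SVM training. It assumes a Fisher-style score as the ranking criterion, which the abstract itself does not name, plus binary labels in {+1, -1} and at least two samples per class. The names `f_score` and `select_top_k` are illustrative, not taken from the article.

```python
from statistics import mean, variance


def f_score(column, labels):
    """Fisher-style score of one feature: between-class spread of the
    class means divided by the summed within-class sample variances.
    Assumed criterion for illustration only."""
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y != 1]
    overall = mean(column)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    if within == 0:
        # No within-class spread: perfectly separating if the means differ.
        return float("inf") if between > 0 else 0.0
    return between / within


def select_top_k(X, y, k):
    """Return the sorted indices of the k highest-scoring features
    together with the full list of scores."""
    columns = [list(col) for col in zip(*X)]
    scores = [f_score(col, y) for col in columns]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


if __name__ == "__main__":
    # Toy data: 4 samples, 3 features; feature 0 separates the classes well.
    X = [[1.0, 5.0, 0.2],
         [1.2, 3.0, 0.1],
         [3.0, 4.0, 0.3],
         [3.2, 4.5, 0.2]]
    y = [1, 1, -1, -1]
    selected, scores = select_top_k(X, y, 2)
    print(selected, scores)
    # Keep only the selected columns; these would then go to an SVM trainer.
    X_reduced = [[row[j] for j in selected] for row in X]
    print(X_reduced)
```

A wrapper-type method would differ in that the selection step uses the SVM itself, for example by inspecting the trained model to decide which features to drop, rather than a score computed before training.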
