A Column Generation Approach for Support Vector Machines

The widely used Support Vector Machine (SVM) method has been shown to yield good results in Supervised Classification problems. However, other methods such as Classification Trees have become more popular among practitioners thanks to their interpretability, which is an important issue in Data Mining. In this work, we propose an SVM-based method that automatically detects the most important predictor variables and the values of those variables that are critical for the classification. Its classification ability is comparable to that of the standard linear SVM and clearly better than that of Classification Trees. Moreover, the proposed method is robust, i.e., it is stable in the presence of outliers and invariant to changes of scale or measurement units of the predictor variables. The method involves solving a Linear Programming problem with a large number of decision variables, for which we use the well-known Column Generation technique.

This work has been partially supported by projects MTM2005-09362-C03-01 of MEC, Spain, and FQM-329 of Junta
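As a rough illustration of the kind of Linear Programming problem with many decision variables that the abstract refers to, the sketch below writes a generic 1-norm soft-margin SVM over a large family of candidate features, together with the pricing test a column generation scheme would apply. The symbols $\phi_j$, $w_j^{\pm}$, $b$, $\xi_i$, $\lambda_i$ and $C$ are assumptions introduced here for illustration; this is a textbook-style formulation, not necessarily the one used in the paper.

```latex
% Training data (x_i, y_i), i = 1..n, with labels y_i in {-1, +1}.
% Candidate features phi_j, j in J, where J may be very large (assumed notation).
\begin{align*}
\min_{w^+,\,w^-,\,b,\,\xi}\quad
  & \sum_{j \in J} \left( w_j^+ + w_j^- \right) + C \sum_{i=1}^{n} \xi_i \\
\text{s.t.}\quad
  & y_i \Bigl( \sum_{j \in J} \left( w_j^+ - w_j^- \right) \phi_j(x_i) + b \Bigr)
    + \xi_i \ge 1, \qquad i = 1,\dots,n, \\
  & w_j^+,\; w_j^- \ge 0 \quad (j \in J), \qquad \xi_i \ge 0 \quad (i = 1,\dots,n).
\end{align*}

% Column generation: solve the LP restricted to a small subset J' of J,
% read off the dual values lambda_i of the margin constraints, and price
% the remaining columns. A feature j outside J' can improve the objective
% only if one of its two columns has negative reduced cost, i.e. if
\[
  \Bigl| \sum_{i=1}^{n} \lambda_i \, y_i \, \phi_j(x_i) \Bigr| > 1 .
\]
% If no feature satisfies this test, the restricted solution is optimal
% for the full problem; otherwise a violating feature is added to J'
% and the restricted LP is solved again.
```

Under these assumptions only the features that ever enter the restricted problem need to be stored explicitly, and with a 1-norm objective many weights end up at zero, which fits the abstract's stated goal of singling out the most important predictor variables.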