Evolutionary learning with kernels: a generic solution for large margin problems

In this paper, we embed evolutionary computation into statistical learning theory. First, we outline the connection between large margin optimization and statistical learning and explain why this paradigm is successful for many pattern recognition problems. We then embed evolutionary computation into the most prominent representative of this class of learning methods, namely Support Vector Machines (SVMs). In contrast to earlier applications of evolutionary algorithms to SVMs, we do not merely optimize the method or kernel parameters. Instead, we use both evolution strategies and particle swarm optimization to solve the posed constrained optimization problem directly. Transforming the problem into the Wolfe dual reduces the total runtime and allows the use of kernel functions. Exploiting knowledge about this optimization problem leads to a hybrid mutation, which further decreases convergence time while preserving classification accuracy. We show that evolutionary SVMs are at least as accurate as their quadratic programming counterparts on six real-world benchmark data sets, and that the evolutionary variants frequently outperform their quadratic programming competitors. Additionally, the proposed algorithm is more generic than existing traditional solutions, since it also works for kernel functions that are not positive semidefinite and for several, possibly competing, performance criteria.
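The idea of solving the Wolfe dual directly with an evolution strategy can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple (1+1) evolution strategy with plain Gaussian mutation (not the hybrid mutation mentioned above), omits the bias term so that only the box constraint 0 <= alpha_i <= C remains, and uses a toy XOR data set. All names (`rbf_kernel`, `dual_objective`, `train_es_svm`, `predict`) and parameter values are hypothetical.

```python
import math
import random


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def dual_objective(alpha, y, K):
    """Wolfe dual: sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j y_i y_j K_ij."""
    n = len(alpha)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * K[i][j]
               for i in range(n) for j in range(n))
    return sum(alpha) - 0.5 * quad


def train_es_svm(X, y, kernel=rbf_kernel, C=1.0,
                 generations=3000, sigma=0.1, seed=0):
    """Maximize the dual with a (1+1)-ES (assumption: no bias term)."""
    rng = random.Random(seed)
    n = len(X)
    # Precompute the kernel matrix once; the ES only needs its entries.
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    best = dual_objective(alpha, y, K)
    for _ in range(generations):
        # Gaussian mutation, then clip into the box [0, C].
        child = [min(C, max(0.0, a + rng.gauss(0.0, sigma))) for a in alpha]
        fitness = dual_objective(child, y, K)
        # Elitist selection: keep the child only if it is not worse.
        if fitness >= best:
            alpha, best = child, fitness
    return alpha, best


def predict(x, X, y, alpha, kernel=rbf_kernel):
    """Sign of the kernel expansion sum_i alpha_i y_i k(x_i, x)."""
    score = sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y, X))
    return 1 if score >= 0 else -1


# Toy XOR problem: not linearly separable, but separable with an RBF kernel.
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y = [1, 1, -1, -1]
alpha, best = train_es_svm(X, y)
predictions = [predict(x, X, y, alpha) for x in X]
print(predictions, best)
```

Because the search only evaluates `dual_objective`, nothing in this sketch requires the kernel matrix to be positive semidefinite, and the fitness function could in principle be replaced by another criterion; both points are consistent with the genericity argued above, though the sketch itself does not demonstrate them.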
