Paper information - The Kernel-Adatron Algorithm: A Fast and Simple Learning Procedure for Support Vector Machines

Support Vector Machines work by mapping training data for classification tasks into a high-dimensional feature space. In the feature space they then find a maximal margin hyperplane which separates the data. This hyperplane is usually found using a quadratic programming routine, which is computationally intensive and nontrivial to implement. In this paper we propose an adaptation of the Adatron algorithm for classification with kernels in high-dimensional spaces. The algorithm is simple and can find a solution very rapidly, with an exponentially fast rate of convergence (in the number of iterations) towards the optimal solution. Experimental results with real and artificial datasets are provided.
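The abstract does not spell out the update rule, so the following is only a minimal sketch of how a kernelized Adatron-style procedure is commonly described, not the authors' exact method. It assumes a hard-margin problem with no bias term, labels in {-1, +1}, and a Gaussian kernel with unit diagonal. The names `rbf_kernel`, `kernel_adatron`, and `decision_value`, the learning rate `eta`, the epoch count, and the XOR-like toy data are all illustrative assumptions. Each multiplier is nudged so that the functional margin of its point moves toward 1, and is clipped at zero so it stays non-negative.

```python
import math


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_adatron(X, y, kernel=rbf_kernel, eta=1.0, epochs=200):
    """Hypothetical bias-free Kernel-Adatron-style training loop.

    X is a list of points, y a list of labels in {-1, +1}.
    Returns the list of non-negative multipliers alpha.
    """
    n = len(X)
    # Precompute the Gram matrix once; the loop only needs kernel values.
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            # Kernel expansion of the current decision value at x_i.
            z_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            # Push the margin y_i * z_i toward 1; clip so alpha_i stays >= 0.
            alpha[i] = max(0.0, alpha[i] + eta * (1.0 - y[i] * z_i))
    return alpha


def decision_value(x, X, y, alpha, kernel=rbf_kernel):
    """Evaluate the learned kernel expansion at a new point x."""
    return sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y, X))


# XOR-like toy data (made up for illustration): not linearly separable
# in the input space, but separable with the Gaussian kernel.
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y = [1, 1, -1, -1]

alpha = kernel_adatron(X, y)
predictions = [1 if decision_value(x, X, y, alpha) > 0 else -1 for x in X]
```

The loop touches only kernel values and simple arithmetic, which is the contrast the abstract draws with a general quadratic programming routine. A practical version would also need a stopping criterion, a bias term, and a way to handle non-separable data, none of which are covered in this sketch.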
