The Neural Support Vector Machine

This paper describes a new machine learning algorithm for regression and dimensionality reduction tasks. The Neural Support Vector Machine (NSVM) is a hybrid learning algorithm that combines neural networks with support vector machines (SVMs). The output of the NSVM is given by SVMs that take a central feature layer as their input. The feature-layer representation is the output of a number of neural networks that are trained to minimize the dual objectives of the SVMs. Because the NSVM uses a shared feature layer, the architecture can handle multiple outputs and can therefore also be used as a dimensionality reduction method. Results on 7 regression datasets show that the NSVM generally outperforms both a standard SVM and a multi-layer perceptron. Furthermore, experiments on eye images show that the NSVM autoencoder outperforms state-of-the-art dimensionality reduction methods.
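
The data flow described above (several neural networks producing a shared feature layer, with one SVM per output reading that layer) can be sketched as a forward pass. This is a minimal illustration, not the authors' implementation: the tanh units, the RBF kernel, the network sizes, and all names (init_net, feature_layer, rbf_kernel, nsvm_forward) are assumptions made here, the coefficients are random placeholders, and the training step that minimizes the SVM dual objectives is not shown.

```python
import math
import random


def init_net(rng, n_in, n_hidden):
    """Random weights for one small network that emits a single feature (assumed shape)."""
    W1 = [[rng.gauss(0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [rng.gauss(0, 0.5) for _ in range(n_hidden)]
    return W1, w2


def feature_layer(x, nets):
    """Shared feature layer: one scalar feature per network."""
    feats = []
    for W1, w2 in nets:
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
        feats.append(math.tanh(sum(w * h for w, h in zip(w2, hidden))))
    return feats


def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel between two feature vectors (kernel choice is an assumption)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def nsvm_forward(x, nets, X_train, coefs, biases, gamma=1.0):
    """One SVM per output; every SVM reads the same feature-layer representation."""
    phi_x = feature_layer(x, nets)
    k = [rbf_kernel(feature_layer(xi, nets), phi_x, gamma) for xi in X_train]
    return [sum(c * ki for c, ki in zip(row, k)) + b for row, b in zip(coefs, biases)]


# Toy usage: 4 inputs, 3 feature networks, 6 training points, 2 outputs.
rng = random.Random(0)
nets = [init_net(rng, n_in=4, n_hidden=5) for _ in range(3)]
X_train = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
coefs = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(2)]  # placeholder dual coefficients
biases = [0.0, 0.0]
outputs = nsvm_forward(X_train[0], nets, X_train, coefs, biases)
print(len(outputs))
```

Because every output SVM consumes the same feature vector, adding an output only adds one row of coefficients and one bias, which is what allows the multi-output and autoencoder uses mentioned in the abstract.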
