Learning of Kernel Functions in Support Vector Machines

The selection and learning of kernel functions is an important but rarely studied problem in support vector learning, even though the kernel function has a strong influence on a support vector machine's performance. The kernel function maps the data from the original input space into a feature space, so problems that cannot be solved in the low-dimensional input space may become solvable in the higher-dimensional feature space induced by the kernel. In this paper, we introduce the gradient descent method into the learning of kernel functions. Using gradient descent, we derive learning rules for the parameters that determine the shape and distribution of the kernel functions, and we can therefore obtain better kernel functions by training these parameters with respect to the risk minimization principle. The experimental results show that our approach derives better kernel functions and thus achieves better generalization ability than other methods.
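The paper's learning rules are derived from the risk minimization principle, which is not reproduced here. As a minimal sketch of the general idea of tuning a kernel parameter by gradient-based learning, the hypothetical example below adjusts the width `gamma` of an RBF kernel by gradient ascent on kernel-target alignment, a related (but different) training criterion; the dataset, objective, and step size are illustrative assumptions, not the paper's method.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Return the RBF kernel matrix K and the squared-distance matrix D."""
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * D), D

def alignment(K, y):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F ||yy^T||_F)."""
    yyT = np.outer(y, y)
    return np.sum(K * yyT) / (np.linalg.norm(K) * np.linalg.norm(yyT))

def learn_gamma(X, y, gamma=0.1, lr=0.05, steps=200):
    """Gradient ascent on the alignment with respect to the kernel width gamma."""
    yyT = np.outer(y, y)
    n_y = np.linalg.norm(yyT)
    for _ in range(steps):
        K, D = rbf_kernel(X, gamma)
        dK = -D * K                      # elementwise derivative dK/dgamma
        n_K = np.linalg.norm(K)
        S = np.sum(K * yyT)
        # Quotient rule applied to A(gamma) = S / (n_K * n_y)
        grad = (np.sum(dK * yyT) * n_K
                - S * np.sum(K * dK) / n_K) / (n_K ** 2 * n_y)
        gamma = max(gamma + lr * grad, 1e-6)  # keep the width positive
    return gamma

# Toy two-class problem: two Gaussian blobs labelled -1 / +1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.3, (20, 2)),
               rng.normal(1.0, 0.3, (20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])

gamma0 = 0.1
a_before = alignment(rbf_kernel(X, gamma0)[0], y)
gamma_opt = learn_gamma(X, y, gamma=gamma0)
a_after = alignment(rbf_kernel(X, gamma_opt)[0], y)
```

The same update loop applies to any differentiable kernel parameter and any differentiable objective; swapping alignment for a risk-based criterion, as the paper does, changes only the gradient computation.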
