Margin and Radius Based Multiple Kernel Learning

A serious drawback of kernel methods, and of Support Vector Machines (SVMs) in particular, is the difficulty of choosing a suitable kernel function for a given dataset. One approach proposed to address this problem is Multiple Kernel Learning (MKL), in which several kernels are combined adaptively for a given dataset. Many existing MKL methods use the SVM objective function and seek a linear combination of basic kernels that maximizes the separating margin between the classes. However, these methods ignore the fact that the theoretical error bound depends not only on the margin, but also on the radius of the smallest sphere that encloses all the training instances. We present a novel MKL algorithm that optimizes this radius-margin error bound, taking both the margin and the radius into account. The empirical results show that the proposed method compares favorably with other state-of-the-art MKL methods.
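The two quantities the bound combines can both be obtained from a kernel matrix: the squared radius R² of the smallest enclosing ball in feature space is the optimum of a small quadratic program over the simplex, and 1/γ² equals ||w||², recoverable from the SVM dual. The sketch below illustrates this for a fixed convex combination of two basic kernels; it is not the paper's algorithm (which optimizes the kernel weights), the toy data and the choice of a generic SLSQP solver are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import minimize

def combined_kernel(kernels, mu):
    """Convex combination K = sum_m mu_m * K_m of basic kernel matrices."""
    return sum(m * K for m, K in zip(mu, kernels))

def squared_radius(K):
    """R^2 of the smallest enclosing ball in feature space:
       max_beta  sum_i beta_i K_ii - beta^T K beta
       s.t. beta_i >= 0, sum_i beta_i = 1."""
    n = K.shape[0]
    d = np.diag(K)
    obj = lambda b: -(d @ b - b @ K @ b)          # negate: minimize = -maximize
    cons = ({'type': 'eq', 'fun': lambda b: b.sum() - 1.0},)
    res = minimize(obj, np.full(n, 1.0 / n), bounds=[(0.0, 1.0)] * n,
                   constraints=cons, method='SLSQP')
    return -res.fun

def inverse_squared_margin(K, y, C=1e6):
    """1/gamma^2 = ||w||^2 via the soft-margin SVM dual:
       max_alpha  sum_i alpha_i - 0.5 alpha^T (yy^T * K) alpha
       s.t. 0 <= alpha_i <= C, sum_i alpha_i y_i = 0.
       Large C approximates the hard-margin case on separable data."""
    n = K.shape[0]
    Q = (y[:, None] * y[None, :]) * K
    obj = lambda a: -(a.sum() - 0.5 * a @ Q @ a)
    cons = ({'type': 'eq', 'fun': lambda a: a @ y},)
    res = minimize(obj, np.zeros(n), bounds=[(0.0, C)] * n,
                   constraints=cons, method='SLSQP')
    a = res.x
    return a @ Q @ a                               # ||w||^2

# Toy linearly separable data (an assumption for illustration).
X = np.array([[0., 0.], [0., 1.], [3., 0.], [3., 1.]])
y = np.array([-1., -1., 1., 1.])

K_lin = X @ X.T
K_rbf = np.exp(-0.5 * np.sum((X[:, None] - X[None, :]) ** 2, axis=-1))
K = combined_kernel([K_lin, K_rbf], mu=[0.5, 0.5])

R2 = squared_radius(K)
w2 = inverse_squared_margin(K, y)
bound = R2 * w2   # proportional to the radius-margin bound R^2 / gamma^2
```

A margin-only MKL method would minimize `w2` alone; the point of the radius-margin objective is that rescaling a kernel can shrink `w2` while inflating `R2`, leaving the product, and hence the bound, unchanged or worse.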
