A General and Efficient Multiple Kernel Learning Algorithm

While classical kernel-based learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We show that it can be rewritten as a semi-infinite linear program that can be efficiently solved by recycling the standard SVM implementations. Moreover, we generalize the formulation and our method to a larger class of problems, including regression and one-class classification. Experimental results show that the proposed algorithm helps with automatic model selection, improves the interpretability of the learning result, and scales to hundreds of thousands of examples or hundreds of kernels to be combined.
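
For orientation, the semi-infinite linear program for the classification case can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the abstract: $\beta_k$ are the weights of the $K$ kernels $k_1, \dots, k_K$, $\alpha$ are the SVM dual variables for $N$ training examples $(x_i, y_i)$, and $C$ is the usual SVM regularization constant.

```latex
\begin{align*}
\max_{\theta \in \mathbb{R},\; \beta \in \mathbb{R}^K} \quad & \theta \\
\text{s.t.} \quad & \beta_k \ge 0, \qquad \sum_{k=1}^{K} \beta_k = 1, \\
& \sum_{k=1}^{K} \beta_k \, S_k(\alpha) \ge \theta
  \quad \text{for all } \alpha \in \mathbb{R}^N \text{ with }
  0 \le \alpha_i \le C, \;\; \sum_{i=1}^{N} \alpha_i y_i = 0, \\
\text{where} \quad & S_k(\alpha) = \frac{1}{2} \sum_{i,j=1}^{N}
  \alpha_i \alpha_j y_i y_j \, k_k(x_i, x_j) \;-\; \sum_{i=1}^{N} \alpha_i .
\end{align*}
```

The objective and constraints are linear in $(\theta, \beta)$, but there is one constraint for every feasible $\alpha$, which makes the program semi-infinite. Under this reading, "recycling" an SVM implementation means alternating two steps: train a single-kernel SVM on the combined kernel $\sum_k \beta_k k_k$ to find the $\alpha$ whose constraint is most violated, then re-solve a finite linear program in $(\theta, \beta)$ over the constraints collected so far.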
