Learning with Uncertain Kernel Matrix Set

We study support vector machines (SVMs) for which the kernel matrix is not specified exactly and is only known to belong to a given uncertainty set. We consider uncertainty arising from two sources: (i) data measurement uncertainty, which stems from statistical errors in the input samples; and (ii) kernel combination uncertainty, which stems from the weights of the individual kernels that must be optimized in the multiple kernel learning (MKL) problem. Previous work has studied uncertainty sets that allow the corresponding SVMs to be reformulated as semidefinite programs (SDPs), which are, however, computationally very expensive to solve. Our focus in this paper is on identifying uncertainty sets that allow the corresponding SVMs to be reformulated as second-order cone programs (SOCPs), since both the worst-case complexity and the practical computational effort required to solve SOCPs are at least an order of magnitude less than those needed to solve SDPs of comparable size. In the main part of the paper we propose four uncertainty sets that meet this criterion. Experimental results are presented to confirm the validity of these SOCP reformulations.
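
To make the notion of kernel combination uncertainty concrete, the following minimal sketch builds a kernel matrix as a convex combination of two base kernels and checks that it remains positive semidefinite. It is an illustration only, not one of the uncertainty sets proposed in the paper: the helper names (`linear_kernel`, `rbf_kernel`, `combine_kernels`), the choice of base kernels, the weights, and the random data are all assumptions introduced here.

```python
import numpy as np

def linear_kernel(X):
    # Gram matrix of the linear kernel k(x, z) = x . z
    return X @ X.T

def rbf_kernel(X, gamma=0.5):
    # Gram matrix of the Gaussian kernel k(x, z) = exp(-gamma * ||x - z||^2)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def combine_kernels(kernels, weights):
    # Hypothetical helper: K = sum_i w_i * K_i with w on the unit simplex.
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to one")
    return sum(w * K for w, K in zip(weights, kernels))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))            # illustrative data, not from the paper

base = [linear_kernel(X), rbf_kernel(X)]
K = combine_kernels(base, [0.3, 0.7])  # one point in the set of admissible K

# Each base kernel is positive semidefinite, so any convex combination is too.
min_eig = np.linalg.eigvalsh(K).min()
print("smallest eigenvalue of K:", min_eig)
```

Every admissible weight vector yields a different kernel matrix, so the SVM must be trained against a set of matrices rather than a single one; the shape of that set is what decides whether the resulting problem is an SDP or an SOCP.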
