Multi-label Multiple Kernel Learning

We present a multi-label multiple kernel learning (MKL) formulation in which the data are embedded into a low-dimensional space directed by the instance-label correlations encoded into a hypergraph. We formulate the problem in the kernel-induced feature space and propose to learn the kernel matrix as a linear combination of a given collection of kernel matrices in the MKL framework. The proposed learning formulation leads to a non-smooth min-max problem, which can be cast into a semi-infinite linear program (SILP). We further propose an approximate formulation with a guaranteed error bound which involves an unconstrained convex optimization problem. In addition, we show that the objective function of the approximate formulation is differentiable with Lipschitz continuous gradient, and hence existing methods can be employed to compute the optimal solution efficiently. We apply the proposed formulation to the automated annotation of Drosophila gene expression pattern images, and promising results have been reported in comparison with representative algorithms.

[1]  John Shawe-Taylor,et al.  Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.

[2]  Yurii Nesterov,et al.  Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.

[3]  M. Ashburner,et al.  Systematic determination of patterns of gene expression during Drosophila embryogenesis , 2002, Genome Biology.

[4]  Gunnar Rätsch,et al.  Large Scale Multiple Kernel Learning , 2006, J. Mach. Learn. Res..

[5]  Fan Chung,et al.  Spectral Graph Theory , 1996 .

[6]  A. Atiya,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2005, IEEE Transactions on Neural Networks.

[7]  Zhi-Hua Zhou,et al.  Multi-Instance Multi-Label Learning with Application to Scene Classification , 2006, NIPS.

[8]  Trevor Darrell,et al.  Approximate Correspondences in High Dimensions , 2006, NIPS.

[9]  Bernhard Schölkopf,et al.  Learning with Hypergraphs: Clustering, Classification, and Embedding , 2006, NIPS.

[10]  Thomas Hofmann,et al.  Multi-Instance Multi-Label Learning with Application to Scene Classification , 2007 .

[11]  Alexander J. Smola,et al.  Learning with Kernels: support vector machines, regularization, optimization, and beyond , 2001, Adaptive computation and machine learning series.

[12]  Jieping Ye,et al.  Automated annotation of Drosophila gene expression patterns using a controlled vocabulary , 2008, Bioinform..

[13]  Serge J. Belongie,et al.  Higher order learning with graphs , 2006, ICML.

[14]  Yurii Nesterov,et al.  Smooth minimization of non-smooth functions , 2005, Math. Program..

[15]  Nello Cristianini,et al.  Learning the Kernel Matrix with Semidefinite Programming , 2002, J. Mach. Learn. Res..

[16]  Cordelia Schmid,et al.  A Performance Evaluation of Local Descriptors , 2005, IEEE Trans. Pattern Anal. Mach. Intell..

[17]  Kenneth O. Kortanek,et al.  Semi-Infinite Programming: Theory, Methods, and Applications , 1993, SIAM Rev..