Multiple kernel learning and feature space denoising

We review a multiple kernel learning (MKL) technique called ℓp regularised multiple kernel Fisher discriminant analysis (MK-FDA), and investigate the effect of feature space denoising on MKL. Experiments show that, with either the original kernels or the denoised kernels, ℓp MK-FDA outperforms its fixed-norm counterparts. Experiments also show that feature space denoising boosts the performance of both single kernel FDA and ℓp MK-FDA, and that there is a positive correlation between the learnt kernel weights and the amount of variance kept by feature space denoising. Based on these observations, we argue that when the base feature spaces are noisy, a linear combination of kernels cannot be optimal. An MKL objective function that takes care of feature space denoising automatically, and that can learn a truly optimal (non-linear) combination of the base kernels, is yet to be found.
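
The two operations discussed above, denoising a base kernel and combining base kernels linearly under an ℓp norm constraint, can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: denoising is taken to mean kernel-PCA-style truncation of the centred kernel matrix so that a chosen fraction of the variance is kept, and the kernel weights are passed in by hand rather than learnt by MK-FDA. The names `center_kernel`, `denoise_kernel`, `combine_kernels` and `variance_kept` are hypothetical and chosen for illustration only.

```python
import numpy as np


def center_kernel(K):
    """Centre a kernel matrix in feature space: K -> H K H."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def denoise_kernel(K, variance_kept=0.9):
    """Keep the leading eigen-directions of the centred kernel.

    Assumption: "feature space denoising" is approximated here by
    truncating the eigen-spectrum so that at least `variance_kept`
    of the total variance is retained.
    Returns the denoised kernel and the number of directions kept.
    """
    Kc = center_kernel(K)
    Kc = (Kc + Kc.T) / 2.0                      # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(Kc)       # ascending eigenvalues
    eigvals = np.clip(eigvals, 0.0, None)       # remove round-off negatives
    order = np.argsort(eigvals)[::-1]           # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    total = eigvals.sum()
    if total == 0.0:
        return Kc, 0
    ratio = np.cumsum(eigvals) / total
    r = min(int(np.searchsorted(ratio, variance_kept)) + 1, len(eigvals))

    V = eigvecs[:, :r]
    return (V * eigvals[:r]) @ V.T, r


def combine_kernels(kernels, beta, p=2.0):
    """Linear combination sum_j beta_j K_j with ||beta||_p = 1, beta >= 0.

    The weights are given, not learnt: this shows only the form of the
    combination, not the MK-FDA optimisation itself.
    """
    beta = np.asarray(beta, dtype=float)
    if np.any(beta < 0):
        raise ValueError("kernel weights must be non-negative")
    beta = beta / np.linalg.norm(beta, ord=p)
    return sum(b * K for b, K in zip(beta, kernels)), beta
```

In this sketch each base kernel is denoised independently before the weighted sum is formed, so the amount of variance kept is a separate, per-kernel choice. A single set of linear weights applied to the original kernels cannot reproduce that per-kernel truncation, which is one way to read the argument that a linear combination is not optimal when the base feature spaces are noisy.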
