Multiple Kernel Learning by Conditional Entropy Minimization

Kernel methods have been used successfully in many practical machine learning problems. However, the choice of a suitable kernel is left to the practitioner. A common approach to automatic kernel selection is to learn a linear combination of element kernels. In this paper, a novel framework for multiple kernel learning is proposed, based on a conditional entropy minimization criterion. Three multiple kernel learning algorithms are derived within this framework. In experiments on benchmark data sets, the algorithms perform comparably to or better than kernel Fisher discriminant analysis and other multiple kernel learning algorithms.
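To make the idea of a linear combination of element kernels concrete, the following is a minimal sketch, not the algorithms proposed in the paper. It assumes Gaussian (RBF) element kernels with hand-picked bandwidths and fixed non-negative weights; the function names `rbf_kernel` and `combine_kernels`, the bandwidth values, and the weights are all hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) Gram matrix for the rows of X (assumed element kernel)."""
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances; clip tiny negatives caused by round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)


def combine_kernels(kernels, weights):
    """Convex combination of Gram matrices: K = sum_m w_m * K_m."""
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0):
        raise ValueError("weights must be non-negative to keep K positive semidefinite")
    w = w / w.sum()  # normalize so the weights sum to one
    return sum(w_m * K_m for w_m, K_m in zip(w, kernels))


# Toy data: 20 points in 3 dimensions (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))

# Three element kernels with different (assumed) bandwidths.
element_kernels = [rbf_kernel(X, gamma) for gamma in (0.1, 1.0, 10.0)]

# Fixed weights stand in for the weights a learning algorithm would choose.
K = combine_kernels(element_kernels, [0.5, 0.3, 0.2])
```

Here the weights are set by hand. A multiple kernel learning algorithm would instead choose them by optimizing a criterion, such as the conditional entropy criterion named in the abstract; that optimization is not shown in this sketch.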
