Information Theoretic Learning with Infinitely Divisible Kernels

In this paper, we develop a framework for information theoretic learning based on infinitely divisible matrices. We formulate an entropy-like functional on positive definite matrices, based on Rényi's axiomatic definition of entropy, and examine key properties of this functional that lead to the concept of infinite divisibility. The proposed formulation avoids plug-in density estimation and brings along the representation power of reproducing kernel Hilbert spaces. As an application example, we derive a supervised metric learning algorithm using a matrix-based analogue of conditional entropy, achieving results comparable with the state of the art.
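
The abstract does not reproduce the functional itself, so the following is a minimal sketch under stated assumptions: it takes the matrix-based Rényi entropy of order alpha to be S_alpha(A) = (1/(1-alpha)) log2 sum_i lambda_i(A)^alpha, computed on a kernel Gram matrix normalized to unit trace, and builds the conditional analogue as S_alpha(A|B) = S_alpha(A∘B / tr(A∘B)) - S_alpha(B), where ∘ is the Hadamard product. The Gaussian kernel, the bandwidth sigma, and all function names here are illustrative choices, not the paper's own API.

    import numpy as np

    def normalized_gram(X, sigma=1.0):
        """Gaussian Gram matrix A with A_ij = K_ij / (n * sqrt(K_ii K_jj)), so tr(A) = 1."""
        sq = np.sum(X**2, axis=1)
        d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
        K = np.exp(-d2 / (2.0 * sigma**2))
        n = K.shape[0]
        d = np.sqrt(np.diag(K))  # equals 1 for a Gaussian kernel, kept for generality
        return K / (n * np.outer(d, d))

    def matrix_renyi_entropy(A, alpha=2.0):
        """S_alpha(A) = (1/(1-alpha)) * log2(sum_i lambda_i(A)^alpha); alpha must differ from 1."""
        lam = np.linalg.eigvalsh(A)
        lam = np.clip(lam, 0.0, None)  # guard against tiny negative eigenvalues
        return np.log2(np.sum(lam**alpha)) / (1.0 - alpha)

    def matrix_conditional_entropy(A, B, alpha=2.0):
        """Conditional analogue S_alpha(A|B) = S_alpha(A∘B / tr(A∘B)) - S_alpha(B)."""
        AB = A * B                     # Hadamard product of the two Gram matrices
        AB = AB / np.trace(AB)         # renormalize to unit trace
        return matrix_renyi_entropy(AB, alpha) - matrix_renyi_entropy(B, alpha)

A short usage example, with synthetic data standing in for features X and labels y:

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 5))
    y = rng.standard_normal((100, 1))
    A, B = normalized_gram(X), normalized_gram(y)
    print(matrix_renyi_entropy(A, alpha=2.0))
    print(matrix_conditional_entropy(A, B, alpha=2.0))

In a supervised metric learning setting of the kind the abstract describes, one would parameterize the features (for example, X @ W for a learned projection W) and minimize the conditional-entropy analogue of the labels given the projected data; the sketch above only fixes the quantities being optimized.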
