An information-theoretic perspective to kernel independent components analysis

In this paper, we investigate the relationship between information-theoretic learning (ITL), based on a weighted Parzen window density estimator, and kernel-based learning algorithms. We prove the equivalence between kernel independent component analysis (kernel ICA) and the Cauchy-Schwarz (C-S) independence measure. This link gives a theoretical motivation, grounded in density estimation, for the selection of the Mercer kernel. Demonstrating the equivalence requires introducing a weighted kernel density estimator, a modification of Parzen windowing. We also discuss the role of the weights in weighted Parzen windowing and in kernel ICA.
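As a rough illustration of the weighted kernel density estimator mentioned above, the following minimal sketch evaluates a one-dimensional estimate of the form p(x) = sum_i w_i G_sigma(x - x_i). The Gaussian window, the requirement that the weights be non-negative and sum to one, and the names `gaussian_kernel` and `weighted_parzen` are assumptions made for this example; they are not taken from the paper. With uniform weights w_i = 1/N the expression reduces to the ordinary Parzen window estimate.

```python
import math


def gaussian_kernel(u, sigma):
    """One-dimensional Gaussian window of width sigma, evaluated at u."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def weighted_parzen(x, samples, weights, sigma):
    """Weighted Parzen estimate at x: sum_i w_i * G_sigma(x - x_i).

    Assumption for this sketch: weights are non-negative and sum to one,
    so that the estimate integrates to one.
    """
    if len(samples) != len(weights):
        raise ValueError("samples and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")
    return sum(w * gaussian_kernel(x - xi, sigma)
               for xi, w in zip(samples, weights))


# Hypothetical toy data, chosen only for illustration.
samples = [-1.0, 0.0, 2.0]
uniform = [1.0 / len(samples)] * len(samples)  # ordinary Parzen windowing
skewed = [0.1, 0.2, 0.7]                       # emphasises the sample at 2.0

p_uniform = weighted_parzen(0.5, samples, uniform, sigma=1.0)
p_skewed = weighted_parzen(0.5, samples, skewed, sigma=1.0)
```

Changing the weights shifts probability mass toward the more heavily weighted samples while the kernel width is left unchanged, which is the degree of freedom the abstract refers to when it discusses the role of the weights.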
