Kernel Methods for Measuring Independence

We introduce two new functionals, the constrained covariance and the kernel mutual information, to measure the degree of independence of random variables. Both quantities are based on the covariance between functions of the random variables in reproducing kernel Hilbert spaces (RKHSs). We prove that when the RKHSs are universal, both functionals are zero if and only if the random variables are pairwise independent. We also show that, near independence, the kernel mutual information is an upper bound on the Parzen window estimate of the mutual information. Analogous results apply to two correlation-based dependence functionals introduced earlier: we show the kernel canonical correlation and the kernel generalised variance to be independence measures for universal kernels, and prove the latter to be an upper bound on the mutual information near independence. The performance of the kernel dependence functionals in measuring independence is verified in the context of independent component analysis.
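As a rough illustration of the idea of a covariance between RKHS functions, the sketch below estimates a constrained-covariance-style quantity from samples. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes Gaussian kernels (which are universal on compact domains), a fixed bandwidth `sigma`, and the finite-sample form obtained by maximising the empirical covariance of f(x) and g(y) over unit-norm functions in the span of the centred kernel features, which reduces to the largest eigenvalue of a product of centred Gram matrices. The function names `gaussian_gram`, `centre`, and `empirical_coco` are hypothetical.

```python
import numpy as np

def gaussian_gram(x, sigma=1.0):
    """Gram matrix of a Gaussian kernel (assumed kernel choice)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def centre(K):
    """Centre a Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def empirical_coco(x, y, sigma=1.0):
    """Sample estimate of sup cov(f(x), g(y)) over unit-norm RKHS functions.

    Assumed derivation: with centred Gram matrices Kc and Lc, the
    maximum equals (1/n) * sqrt(lambda_max(Kc^{1/2} Lc Kc^{1/2})).
    """
    n = len(x)
    Kc = centre(gaussian_gram(x, sigma))
    Lc = centre(gaussian_gram(y, sigma))
    # Symmetric square root of Kc via its eigendecomposition.
    w, V = np.linalg.eigh(Kc)
    K_half = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T
    M = K_half @ Lc @ K_half
    M = 0.5 * (M + M.T)  # remove round-off asymmetry
    lam_max = np.linalg.eigvalsh(M)[-1]
    return np.sqrt(max(lam_max, 0.0)) / n

# Usage: a nonlinear (uncorrelated) dependence versus independent noise.
rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y_dependent = x ** 2 + 0.1 * rng.standard_normal(200)
y_independent = rng.standard_normal(200)
print(empirical_coco(x, y_dependent), empirical_coco(x, y_independent))
```

With a finite sample the estimate is never exactly zero for independent variables; it is only expected to be small relative to the dependent case, and the bandwidth choice affects both values.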
