Kernel independent component analysis

We present a class of algorithms for independent component analysis (ICA) which use contrast functions based on canonical correlations in a reproducing kernel Hilbert space. On the one hand, we show that our contrast functions are related to mutual information and have desirable mathematical properties as measures of statistical dependence. On the other hand, building on recent developments in kernel methods, we show that these criteria and their derivatives can be computed efficiently. Minimizing these criteria leads to flexible and robust algorithms for ICA. We illustrate with simulations involving a wide variety of source distributions, showing that our algorithms outperform many of the presently known algorithms.
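
To make the kind of contrast function concrete, the sketch below computes a regularized first kernel canonical correlation between two candidate source estimates and the corresponding contrast value -1/2 log(1 - rho^2), which vanishes when the estimates appear independent under the chosen kernel. This is a minimal sketch, not the authors' reference implementation: the Gaussian RBF kernel width sigma, the regularization parameter kappa, and the helper names gaussian_gram and kcca_contrast are illustrative assumptions.

import numpy as np
from scipy.linalg import eigh

def gaussian_gram(y, sigma=1.0):
    # Gram matrix of a Gaussian RBF kernel on a one-dimensional sample y
    y = np.asarray(y, dtype=float)
    d = y[:, None] - y[None, :]
    return np.exp(-d**2 / (2.0 * sigma**2))

def center_gram(K):
    # Center a Gram matrix in feature space: K -> H K H, H = I - (1/n) 1 1^T
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kcca_contrast(y1, y2, sigma=1.0, kappa=2e-2):
    # First regularized kernel canonical correlation rho between two
    # candidate sources, and the contrast -1/2 log(1 - rho^2).
    n = len(y1)
    K1 = center_gram(gaussian_gram(y1, sigma))
    K2 = center_gram(gaussian_gram(y2, sigma))
    R1 = K1 + 0.5 * n * kappa * np.eye(n)   # regularized operators
    R2 = K2 + 0.5 * n * kappa * np.eye(n)
    # Generalized eigenvalue problem for regularized kernel CCA:
    # [[0, K1 K2], [K2 K1, 0]] v = rho [[R1^2, 0], [0, R2^2]] v
    A = np.block([[np.zeros((n, n)), K1 @ K2],
                  [K2 @ K1, np.zeros((n, n))]])
    B = np.block([[R1 @ R1, np.zeros((n, n))],
                  [np.zeros((n, n)), R2 @ R2]])
    rho = eigh(A, B, eigvals_only=True)[-1]  # largest correlation
    rho = min(max(rho, 0.0), 1.0 - 1e-12)
    return -0.5 * np.log(1.0 - rho**2)

In the full algorithm this contrast is evaluated on the rows of Y = W X and minimized over demixing matrices W (after whitening the observations); the sketch above only shows the contrast evaluation for a pair of estimated sources, and a practical implementation would replace the dense eigensolve with the low-rank kernel approximations discussed in the paper.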
