Unsupervised learning for blind source separation: an information-theoretic approach

This paper gives a detailed and rigorous analysis of two commonly used methods for redundancy reduction: linear independent component analysis (ICA) and information maximization (InfoMax). It shows analytically that ICA, using the Kullback-Leibler information as its mutual information measure, and InfoMax lead to the same solution if the parameterization of the output nonlinear functions in InfoMax is sufficiently rich. The paper also discusses alternative redundancy measures that are not based on the Kullback-Leibler information distance, as well as nonlinear ICA, and addresses practical issues that arise when applying ICA and InfoMax.
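
The equivalence claimed above can be illustrated with a standard entropy decomposition. This is a sketch under assumed notation, not necessarily the paper's own derivation: x is the observed mixture, W an invertible unmixing matrix, u = Wx the linear outputs, and y_i = g_i(u_i) the InfoMax outputs, where each g_i is assumed to be a differentiable, strictly increasing map onto (0, 1), so that its derivative g_i' is a valid probability density.

```latex
% Assumed notation (not from the abstract): u = Wx, y_i = g_i(u_i),
% g_i strictly increasing onto (0,1), so g_i' integrates to 1.
\begin{align}
  I(u) &= D_{\mathrm{KL}}\Big( p(u) \,\Big\|\, \prod_i p_i(u_i) \Big)
        = \sum_i H(u_i) - H(u)
        && \text{(ICA objective, to be minimized)} \\
  H(y) &= H(u) + \sum_i \mathbb{E}\big[ \log g_i'(u_i) \big]
        && \text{(change of variables)} \\
       &= -I(u) + \sum_i \Big( H(u_i) + \mathbb{E}\big[ \log g_i'(u_i) \big] \Big) \\
       &= -I(u) - \sum_i D_{\mathrm{KL}}\big( p_i(u_i) \,\big\|\, g_i'(u_i) \big)
        && \text{(InfoMax objective, to be maximized)}
\end{align}
```

Under these assumptions, each Kullback-Leibler term in the last line is non-negative and vanishes only when g_i' matches the marginal density of u_i. If the family of nonlinearities is rich enough to achieve that for every W, maximizing H(y) reduces to minimizing I(u), which is the sense in which the two methods agree.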