Maximum Entropy and Minimal Mutual Information in a Nonlinear Model

In blind source separation, two separation techniques are mainly used: Minimal Mutual Information (MMI), where minimizing the mutual information of the outputs yields an independent random vector, and Maximum Entropy (ME), where the output entropy is maximized. It is, however, still unclear why ME should solve the separation problem, i.e., result in an independent vector. Amari gave a partial justification for ME in the linear case [13], proving that, under the assumption of zero-mean sources, ME does not change the solutions of MMI up to scaling and permutation. In this paper, we generalize Amari's approach to nonlinear ICA problems, in which the random vectors have been mixed by output functions of layered neural networks. We show that certain solution points of MMI are kept fixed by ME if no scaling of the weight vectors is allowed. In general, however, ME might leave those MMI solutions that use diagonal weights in the first network layer. We therefore conclude by suggesting that nonlinear ME algorithms keep the diagonal weights fixed in later epochs.
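
The paper gives no code, but the ME principle it analyzes corresponds to the Infomax rule of Bell and Sejnowski [6] combined with Amari's natural gradient [13]. The following is a minimal, hypothetical NumPy sketch, assuming a linear mixture in place of the paper's layered-network mixtures: it runs the natural-gradient ME update and then illustrates the concluding suggestion by freezing the diagonal weights after a fixed number of epochs. The mixing matrix, learning rate, and the `freeze_diag_after` threshold are all illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two independent zero-mean sources, linearly mixed.
# (Assumption: a 2x2 linear mixture stands in for the paper's
# layered-network mixtures; A is illustrative.)
n = 5000
s = np.vstack([rng.laplace(size=n), rng.uniform(-1.0, 1.0, size=n)])
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
x = (A @ s).T                       # samples as rows, shape (n, 2)

W = np.eye(2)                       # unmixing matrix to be learned
lr = 0.01
freeze_diag_after = 50              # hypothetical epoch threshold

for epoch in range(100):
    for batch in np.array_split(x, 100):
        u = batch @ W.T             # current source estimates
        y = 1.0 / (1.0 + np.exp(-u))  # logistic output nonlinearity
        # Natural-gradient ME (Infomax) update of Bell & Sejnowski / Amari:
        #   dW ∝ (I + E[(1 - 2y) u^T]) W
        grad = (np.eye(2) + ((1.0 - 2.0 * y).T @ u) / len(batch)) @ W
        if epoch >= freeze_diag_after:
            # Per the paper's suggestion: hold the diagonal weights
            # fixed in later epochs by zeroing their gradient entries.
            np.fill_diagonal(grad, 0.0)
        W += lr * grad

# Up to scaling and permutation, W @ A should approximate a scaled
# permutation matrix if separation succeeded.
print(W @ A)
```

In this sketch the diagonal freeze simply zeroes the corresponding gradient entries; in a layered network the same idea would be applied to the first-layer weight matrix only.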

[1] Christian Jutten et al., "Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture," Signal Processing, 1991.

[2] Christian Jutten et al., "Source separation in post-nonlinear mixtures," IEEE Transactions on Signal Processing, 1999.

[3] Andrzej Cichocki et al., "Information-theoretic approach to blind separation of sources in non-linear mixture," Signal Processing, 1998.

[4] D. Signorini et al., "Neural networks," The Lancet, 1995.

[5] J. Nadal et al., "Nonlinear neurons in the low-noise limit: a factorial code maximizes information transfer," Network, vol. 5, 1994.

[6] Terrence J. Sejnowski et al., "An Information-Maximization Approach to Blind Separation and Blind Deconvolution," Neural Computation, 1995.

[7] A. Hyvarinen et al., "On existence and uniqueness of solutions in nonlinear independent component analysis," Proceedings of the 1998 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), 1998.

[8] Ralph Linsker, "Local Synaptic Learning Rules Suffice to Maximize Mutual Information in a Linear Network," Neural Computation, 1992.

[9] Ralph Linsker, "An Application of the Principle of Maximum Information Preservation to Linear Systems," NIPS, 1988.

[10] Pierre Comon, "Independent component analysis, a new concept?," Signal Processing, 1994.

[11] Kurt Hornik et al., "Multilayer feedforward networks are universal approximators," Neural Networks, 1989.

[12] Te-Won Lee et al., "Nonlinear approaches to Independent Component Analysis," 2000.

[13] Shun-ichi Amari et al., "Adaptive Online Learning Algorithms for Blind Separation: Maximum Entropy and Minimum Mutual Information," Neural Computation, 1997.

[14] Peter L. Bartlett et al., "Neural Network Learning: Theoretical Foundations," 1999.

[15] Te-Won Lee et al., "Independent Component Analysis," Springer US, 1998.