Stochastic dynamics of supervised learning

The stochastic evolution of adiabatic (slow) backpropagation training of a neural network is discussed, and a Fokker-Planck equation for the post-training distribution function in network space is derived. The distribution obtained differs from the one given by Radons et al. (1990). Studying the character of the post-training distribution, the authors find that, except under very special circumstances, it is non-Gibbsian. The validity of the approach is tested on a simple one-dimensional backpropagation learning system, which can also be solved analytically. Implications of the Fokker-Planck approach for general situations are examined in the local linear approximation. Surprisingly, the authors find that the post-training distribution is isotropic close to its peak, and hence simpler than the corresponding Gibbs distribution.
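
For illustration only, the following is a minimal numerical sketch, not taken from the paper, of the kind of slow stochastic training whose stationary (post-training) distribution a Fokker-Planck analysis describes in the small-learning-rate limit. The quadratic loss, noise model, and all parameters below are assumptions chosen for simplicity; they do not reproduce the paper's one-dimensional system.

```python
import numpy as np

# Sketch: simulate adiabatic (small-learning-rate) stochastic gradient
# descent on a toy 1-D loss and histogram the long-run weight values,
# i.e. an empirical stand-in for the post-training distribution.

rng = np.random.default_rng(0)

def sample_gradient(w):
    # Per-example gradient of a toy quadratic loss E(w) = w**2 / 2,
    # corrupted by additive noise (a hypothetical noise model).
    return w + 0.5 * rng.standard_normal()

eta = 0.01                       # small (adiabatic) learning rate
w = 1.0                          # initial weight
burn_in, n_samples = 10_000, 200_000
samples = np.empty(n_samples)

for t in range(burn_in + n_samples):
    w -= eta * sample_gradient(w)      # one stochastic training step
    if t >= burn_in:
        samples[t - burn_in] = w

# For this linear toy model the stationary density is Gaussian, i.e.
# symmetric about its peak; weight-dependent noise amplitude is the
# kind of ingredient that would push it away from a Gibbs form.
print(f"mean = {samples.mean():.4f}, var = {samples.var():.4f}")
```

With these assumed parameters the printed variance approximates the width of the stationary distribution; shrinking eta narrows it, consistent with the adiabatic limit in which the Fokker-Planck description applies.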