Paper information - Variational Foundations of Online Backpropagation

Variational Foundations of Online Backpropagation

On-line Backpropagation has become very popular and has been the subject of in-depth theoretical analyses and massive experimentation. Yet almost three decades after its publication, it is still, surprisingly, the source of tough theoretical questions and of experimental results that remain somewhat shrouded in mystery. Although seriously plagued by local minima, the batch-mode version of the algorithm is clearly posed as an optimization problem, whereas the on-line version, in spite of its effectiveness in many real-world problems, has not yet been given a clean formulation. In this paper, using variational arguments, the on-line formulation is proposed as the minimization of a classic functional that is inspired by the principle of minimal action in analytic mechanics. The proposed approach clashes sharply with common interpretations of on-line learning as an approximation of batch-mode, and it suggests that processing data all at once might be just an artificial formulation of learning that is hopeless in difficult real-world problems.

Marco Gori | Salvatore Frandina | Marco Lippi | Marco Maggini | Stefano Melacci

[1] Marco Gori, et al. Optimal convergence of on-line backpropagation, 1996, IEEE Trans. Neural Networks.

[2] Petr Jizba, et al. Quantum mechanics of the damped harmonic oscillator, 2002.

[3] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.

[4] Kurt Hornik, et al. Approximation capabilities of multilayer feedforward networks, 1991, Neural Networks.

[5] L. Herrera, et al. A variational principle and the classical and quantum mechanics of the damped harmonic oscillator, 1986.

[6] Marcello Sanguineti, et al. Learning with Boundary Conditions, 2013, Neural Computation.

[7] Alberto Tesi, et al. On the Problem of Local Minima in Backpropagation, 1992, IEEE Trans. Pattern Anal. Mach. Intell.

[8] George M. Siouris, et al. Applied Optimal Control: Optimization, Estimation, and Control, 1979, IEEE Transactions on Systems, Man, and Cybernetics.

[9] Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.

[10] P. Werbos, et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences, 1974.