Convergent on-line algorithms for supervised learning in neural networks

In this paper we define on-line algorithms for neural-network training based on the construction of multiple copies of the network, each of which is trained on a different data block. It is shown that suitable training algorithms can be defined so that the disagreement between the different copies of the network is asymptotically reduced, and convergence toward stationary points of the global error function can be guaranteed. Relevant features of the proposed approach are that the learning rate need not be forced to zero and that real-time learning is permitted. A toy sketch of this multiple-copies idea is given below.
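The following sketch illustrates, under assumptions made purely for exposition, the flavor of such a multiple-copies scheme: several copies of the parameters are each updated on their own data block, and an averaging (consensus) step reduces the disagreement between copies, while the learning rate stays constant. It is not the authors' algorithm; the toy linear least-squares model, the block split, the step size, and the 0.5 averaging weight are all illustrative choices.

```python
# Illustrative sketch only: a toy "multiple-copies" online scheme in the
# spirit of the abstract, NOT the paper's actual algorithm. The linear
# least-squares model, block sizes, learning rate, and averaging step
# are assumptions made here for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data split into P blocks (one block per network copy).
n_samples, n_features, P = 600, 5, 3
X = rng.normal(size=(n_samples, n_features))
w_true = rng.normal(size=n_features)
y = X @ w_true + 0.01 * rng.normal(size=n_samples)
blocks = np.array_split(np.arange(n_samples), P)

def block_grad(w, idx):
    """Gradient of the mean squared error restricted to one data block."""
    Xi, yi = X[idx], y[idx]
    return Xi.T @ (Xi @ w - yi) / len(idx)

# One weight vector per copy of the network.
copies = [np.zeros(n_features) for _ in range(P)]
eta = 0.05          # learning rate: kept constant, not driven to zero
n_epochs = 200

for epoch in range(n_epochs):
    # Each copy takes a gradient step on its own data block.
    copies = [w - eta * block_grad(w, idx) for w, idx in zip(copies, blocks)]
    # Consensus step: pull every copy toward the average, so the
    # disagreement between copies shrinks over the iterations.
    mean_w = np.mean(copies, axis=0)
    copies = [0.5 * w + 0.5 * mean_w for w in copies]

disagreement = max(np.linalg.norm(w - mean_w) for w in copies)
full_grad = np.linalg.norm(sum(block_grad(mean_w, idx) for idx in blocks) / P)
print(f"disagreement between copies: {disagreement:.2e}")
print(f"norm of full-data gradient:  {full_grad:.2e}")
```

Running the sketch, both printed quantities become small, mirroring the abstract's claim that the copies can be driven to agree while the averaged iterate approaches a stationary point of the global error, without annealing the learning rate to zero.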
