Deterministic Convergence of an Online Gradient Method with Momentum

An online gradient method with momentum for feedforward neural networks is considered. The learning rate is held constant, while the momentum coefficient is an adaptive variable. Both weak and strong convergence results are proved, together with convergence rates for the error function and for the weights.
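As a rough illustration of the kind of iteration described above, the following Python sketch trains a single sigmoid neuron with online (sample-by-sample) updates, a constant learning rate, and a momentum coefficient recomputed at every step. The abstract does not state the adaptive rule, so the choice used here, tau = mu * eta * ||g|| / ||previous step||, is an assumption made for illustration only, as are the names `train_online_momentum`, `eta`, `mu`, and the squared-error loss.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def total_error(w, samples):
    """Sum of squared errors 0.5 * (y - target)^2 over all samples."""
    err = 0.0
    for x, target in samples:
        y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        err += 0.5 * (y - target) ** 2
    return err


def train_online_momentum(samples, eta=0.5, mu=0.5, epochs=500):
    """Online gradient method with momentum for one sigmoid neuron.

    eta: constant learning rate.
    mu:  factor in (0, 1) used by the ASSUMED adaptive momentum rule.
    Returns the final weights and the total error after each epoch.
    """
    w = [0.0] * len(samples[0][0])
    dw_prev = [0.0] * len(w)
    errors = []
    for _ in range(epochs):
        for x, target in samples:  # weights are updated after every sample
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            # Gradient of 0.5 * (y - target)^2 with respect to w.
            g = [(y - target) * y * (1.0 - y) * xi for xi in x]
            g_norm = math.sqrt(sum(gi * gi for gi in g))
            dw_norm = math.sqrt(sum(di * di for di in dw_prev))
            # Assumed rule: the momentum term has norm mu * eta * ||g||,
            # so it never exceeds the size of the gradient step itself.
            tau = mu * eta * g_norm / dw_norm if dw_norm > 0.0 else 0.0
            dw = [-eta * gi + tau * di for gi, di in zip(g, dw_prev)]
            w = [wi + di for wi, di in zip(w, dw)]
            dw_prev = dw
        errors.append(total_error(w, samples))
    return w, errors
```

Calling `train_online_momentum` on a small training set, for example the logical OR function with a constant bias input, returns the weights and a per-epoch error history that can be inspected for the decrease that the convergence results concern. The sketch is not the network or the coefficient rule analysed in the paper.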
