A SPECTRAL VERSION OF PERRY'S CONJUGATE GRADIENT METHOD FOR NEURAL NETWORK TRAINING

In this work, an efficient training algorithm for feedforward neural networks is presented. It is based on a scaled version of the conjugate gradient method suggested by Perry, which employs the spectral steplength of Barzilai and Borwein; this steplength incorporates second-order information without estimating the Hessian matrix. The learning rate is automatically adapted at each epoch, using the conjugate gradient values and the learning rate of the previous epoch. In addition, a new acceptability criterion for the learning rate is utilized, based on nonmonotone Wolfe conditions. The efficiency of the training algorithm is demonstrated on standard benchmarks, including the XOR, 3-bit parity, font learning, and function approximation problems.
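The core iteration the abstract describes can be sketched in a few lines. The sketch below is a hypothetical minimal implementation, not the authors' exact training algorithm: it combines the Barzilai-Borwein spectral steplength with a Perry-type conjugate gradient direction, and substitutes plain Armijo backtracking for the paper's nonmonotone Wolfe acceptability test. The function name `spectral_perry_cg` and all parameter defaults are illustrative assumptions.

```python
import numpy as np

def spectral_perry_cg(f, grad, x0, max_iter=500, tol=1e-8):
    """Minimal sketch of a spectral Perry-type conjugate gradient loop.

    Hypothetical helper: Armijo backtracking stands in for the paper's
    nonmonotone Wolfe acceptability criterion.
    """
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g  # first search direction: steepest descent
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Armijo backtracking line search (stand-in for the nonmonotone
        # Wolfe conditions used in the paper)
        alpha, fx, slope = 1.0, f(x), g.dot(d)
        while f(x + alpha * d) > fx + 1e-4 * alpha * slope:
            alpha *= 0.5
        x_new = x + alpha * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s.dot(y)
        if sy > 1e-12:
            theta = s.dot(s) / sy                   # Barzilai-Borwein spectral steplength
            beta = (theta * y - s).dot(g_new) / sy  # Perry-type CG parameter
            d = -theta * g_new + beta * s           # spectral Perry direction
        else:
            d = -g_new  # curvature too small: restart with steepest descent
        if g_new.dot(d) >= 0.0:  # safeguard: enforce a descent direction
            d = -g_new
        x, g = x_new, g_new
    return x
```

In a training context, `f` would be the network's error function over the weights and `grad` its gradient computed by backpropagation; here any smooth objective works, e.g. a convex quadratic.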

[1] Wenyu Sun, et al. Global convergence of nonmonotone descent methods for unconstrained optimization problems, 2002.

[2] G. Kane. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1: Foundations; vol. 2: Psychological and Biological Models, 1994.

[3] Bernard Widrow, et al. Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights, 1990 IJCNN International Joint Conference on Neural Networks.

[4] A. K. Rigler, et al. Accelerating the convergence of the back-propagation method, Biological Cybernetics, 1988.

[5] Robert A. Jacobs, et al. Increased rates of convergence through learning rate adaptation, Neural Networks, 1987.

[6] David F. Shanno, et al. Algorithm 500: Minimization of Unconstrained Multivariate Functions [E4], ACM TOMS, 1976.

[7] A. Perry. A Modified Conjugate Gradient Algorithm for Unconstrained Nonlinear Optimization, 1975.

[8] J. Meigs, et al. WHO Technical Report, The Yale Journal of Biology and Medicine, 1954.

[9] J. M. Martínez, et al. A Spectral Conjugate Gradient Method for Unconstrained Optimization, 2001.

[10] Vassilis P. Plagianakos, et al. A Nonmonotone Backpropagation Training Method for Neural Networks, 1998.

[11] Vassilis P. Plagianakos, et al. Automatic Adaptation of Learning Rate for Backpropagation Neural Networks, 1998.

[12] Marcos Raydan, et al. The Barzilai and Borwein Gradient Method for the Large Scale Unconstrained Minimization Problem, SIAM J. Optim., 1997.

[13] J. Borwein, et al. Two-Point Step Size Gradient Methods, 1988.

[14] D. E. Rumelhart, et al. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 1986.