Training Feed-forward Neural Networks Using the Gradient Descent Method with the Optimal Stepsize

The most widely used algorithm for training multilayer feedforward networks, Error BackPropagation (EBP), is by nature an iterative gradient descent algorithm. A variable stepsize is key to the fast convergence of BP networks. A new optimal-stepsize algorithm is proposed for accelerating the training process. It modifies the objective function to reduce the computational complexity of the Jacobian, and consequently of the Hessian matrix, and thereby computes the optimal iterative stepsize directly. The improved backpropagation algorithm helps alleviate the problems of slow convergence and oscillation. The analysis indicates that backpropagation with optimal stepsize (BPOS) is more efficient when treating large-scale samples. Numerical experiments on pattern recognition and function approximation problems show that the proposed algorithm converges quickly and is computationally less intensive.
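For illustration, here is a minimal sketch (not the paper's exact derivation) of one steepest-descent step whose stepsize is chosen to minimize the local quadratic model of a sum-of-squares loss, with the Hessian approximated in Gauss-Newton fashion as J^T J; the helper names `residual_fn` and `jacobian_fn` are assumptions for this sketch.

```python
import numpy as np

def gd_step_optimal_alpha(params, residual_fn, jacobian_fn):
    """One steepest-descent step with the stepsize minimizing the local
    quadratic model of a sum-of-squares loss E(w) = 0.5 * ||r(w)||^2.

    Under the Gauss-Newton approximation H ~ J^T J, the minimizer of
    E(w - alpha * g) over alpha is  alpha* = (g^T g) / (g^T H g).
    `residual_fn` returns the residual vector r(w) (shape (m,)) and
    `jacobian_fn` returns its Jacobian J = dr/dw (shape (m, n)); both
    are assumed helpers, not part of the paper's notation.
    """
    r = residual_fn(params)              # network residuals on the sample set
    J = jacobian_fn(params)              # Jacobian of the residuals
    g = J.T @ r                          # gradient of 0.5 * ||r||^2
    Jg = J @ g                           # avoids forming H = J^T J explicitly
    alpha = (g @ g) / (Jg @ Jg + 1e-12)  # optimal stepsize for the quadratic model
    return params - alpha * g, alpha
```

Note that computing `J @ g` costs one Jacobian-vector product rather than the full n-by-n Hessian, which mirrors the abstract's point that reducing the cost of the Jacobian (and hence the Hessian) makes the optimal stepsize cheap enough to compute at every iteration.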
