Levenberg-Marquardt algorithm with adaptive momentum for the efficient training of feedforward networks

We present a highly efficient second-order algorithm for the training of feedforward neural networks. The algorithm is based on iterations of the form employed in the Levenberg-Marquardt (LM) method for nonlinear least squares problems, with the inclusion of an additional adaptive momentum term arising from the formulation of the training task as a constrained optimization problem. Its implementation requires minimal additional computation compared to a standard LM iteration, a cost that is compensated by its excellent convergence properties. Simulations of large-scale classical neural network benchmarks are presented which reveal the power of the method to obtain solutions in difficult problems where other standard second-order techniques (including LM) fail to converge.
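To illustrate the general shape of such an iteration, the following is a minimal sketch of an LM step with a momentum term added, applied to a toy curve-fitting problem rather than a neural network. It is not the algorithm of the paper: the momentum coefficient `beta` is held fixed here as a simplifying assumption, whereas the paper derives an adaptive coefficient from a constrained optimization formulation. The function names, the damping schedule (divide or multiply `mu` by 10) and the toy model `a * exp(b * x)` are likewise hypothetical choices made for the example.

```python
import numpy as np

def lm_with_momentum(residual_fn, jacobian_fn, w, iters=100, mu=1e-2, beta=0.3):
    """Minimize 0.5 * ||r(w)||^2 with LM steps plus a fixed momentum term.

    beta is a fixed coefficient (an assumption of this sketch); the paper's
    method chooses the momentum contribution adaptively.
    """
    prev_step = np.zeros_like(w)
    loss = 0.5 * np.sum(residual_fn(w) ** 2)
    for _ in range(iters):
        r = residual_fn(w)
        J = jacobian_fn(w)
        # Standard LM direction: solve (J^T J + mu I) d = -J^T r.
        lm_step = -np.linalg.solve(J.T @ J + mu * np.eye(w.size), J.T @ r)
        # Add a fraction of the previous accepted step (momentum).
        step = lm_step + beta * prev_step
        new_loss = 0.5 * np.sum(residual_fn(w + step) ** 2)
        if new_loss < loss:
            # Accept the step and relax the damping.
            w, loss, prev_step = w + step, new_loss, step
            mu = max(mu / 10.0, 1e-12)
        else:
            # Reject the step, increase damping, and drop the momentum.
            mu *= 10.0
            prev_step = np.zeros_like(w)
    return w, loss

# Toy least squares problem (hypothetical): fit y = a * exp(b * x).
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * x)

def residual_fn(w):
    return w[0] * np.exp(w[1] * x) - y

def jacobian_fn(w):
    e = np.exp(w[1] * x)
    # Columns: d r / d a and d r / d b.
    return np.stack([e, w[0] * x * e], axis=1)

w0 = np.array([1.0, 0.0])
initial_loss = 0.5 * np.sum(residual_fn(w0) ** 2)
w_fit, final_loss = lm_with_momentum(residual_fn, jacobian_fn, w0)
```

Because a step is accepted only when it lowers the loss, the loss never increases from one iteration to the next. For a feedforward network, `residual_fn` would return the per-pattern output errors and `jacobian_fn` their derivatives with respect to the weights.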
