Dual Gradient Descent Algorithm on Two-Layered Feed-Forward Artificial Neural Networks

The learning algorithms for multilayered feed-forward networks can be classified into two categories: gradient-based and non-gradient-based methods. Gradient descent algorithms such as backpropagation (BP) and its variations are widely used in many application areas because of their simplicity and convenience. However, the most serious problem associated with BP is the local minima problem. We propose an improved gradient descent algorithm intended to alleviate the local minima problem without sacrificing the simplicity of the gradient descent method. The proposed method, called the dual gradient learning algorithm, evaluates and trains the upper connections (hidden-to-output) and the lower connections (input-to-hidden) separately. To do so, target values for the hidden-layer units are introduced and used as the evaluation criterion for the lower connections. Simulations on several benchmark problems and a real classification task demonstrate the validity of the proposed method.
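
As a rough illustration of the idea described above, the following NumPy sketch trains a two-layer feed-forward network by updating the upper (hidden-to-output) and lower (input-to-hidden) weights in separate gradient steps, with the lower weights trained against hidden-layer target values. The particular way the hidden targets are derived here (a small gradient step on the hidden activations), along with the class name, learning rate, and sigmoid activations, are illustrative assumptions and not necessarily the paper's exact formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DualGradientNet:
    """Two-layer network trained with separate upper/lower gradient steps (illustrative sketch)."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.5, size=(n_in, n_hidden))   # lower connections (input-to-hidden)
        self.W2 = rng.normal(scale=0.5, size=(n_hidden, n_out))  # upper connections (hidden-to-output)
        self.lr = lr

    def forward(self, x):
        h = sigmoid(x @ self.W1)      # hidden activations
        y = sigmoid(h @ self.W2)      # network output
        return h, y

    def train_step(self, x, t):
        h, y = self.forward(x)

        # Output-layer error signal.
        delta_out = (y - t) * y * (1.0 - y)

        # Hidden targets: nudge the hidden activations in the direction that
        # reduces the output error (one possible choice, assumed for illustration).
        h_target = np.clip(h - self.lr * (delta_out @ self.W2.T), 0.0, 1.0)

        # Upper connections: ordinary gradient step on the output error.
        self.W2 -= self.lr * np.outer(h, delta_out)

        # Lower connections: gradient step on the hidden-layer error,
        # i.e. the mismatch between the hidden activations and their targets.
        delta_hid = (h - h_target) * h * (1.0 - h)
        self.W1 -= self.lr * np.outer(x, delta_hid)

        return float(np.mean((y - t) ** 2))

# Usage example: a small benchmark-style task (XOR).
if __name__ == "__main__":
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    T = np.array([[0], [1], [1], [0]], dtype=float)
    net = DualGradientNet(2, 4, 1)
    for epoch in range(5000):
        err = sum(net.train_step(x, t) for x, t in zip(X, T))
    print("final mean squared error:", err)
```

The key design point the sketch tries to capture is that the lower weights are never evaluated directly against the output error; they are evaluated against explicit hidden-layer targets, so each layer has its own local error criterion.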
