Terminal attractor algorithms: A critical analysis

Abstract: One of the fundamental drawbacks of gradient descent learning techniques is their susceptibility to local minima during training. Recently, some authors have independently introduced new learning algorithms based on the properties of terminal attractors and repellers. These algorithms were claimed to perform global optimization of the cost in finite time, provided that a null (zero-cost) solution exists. In this paper, we prove that, for local-minima-free error functions, terminal attractor algorithms guarantee that the optimal solution is reached in a number of steps that is independent of the cost function. Moreover, for multimodal functions, we prove that, unfortunately, there are no theoretical guarantees that a global solution can be reached, nor that the algorithms perform satisfactorily from an operational point of view, unless particularly favourable conditions are satisfied. On the other hand, the ideas behind these innovative methods are very interesting and deserve further investigation.
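
To make the mechanism behind these claims concrete, the sketch below compares the error decay induced by a terminal-attractor-style update with that of ordinary gradient flow; it is an illustration under assumed constants, not code taken from any of the cited algorithms. Rules of this kind rescale the gradient step, for example Δw ∝ -E^k ∇E / ||∇E||² with 0 < k < 1, so that along the resulting flow the cost obeys dE/dt = -c E^k. Since the right-hand side is non-Lipschitz at E = 0, the error reaches zero in the finite time T = E(0)^(1-k) / (c(1-k)) whenever a zero-cost solution exists, whereas plain gradient flow near a minimum behaves like dE/dt ∝ -E and only converges asymptotically.

    # Sketch under assumed constants: error decay induced by a terminal attractor,
    # dE/dt = -c * E**k with 0 < k < 1, versus the exponential decay dE/dt = -c * E
    # typical of ordinary gradient flow near a quadratic minimum.
    # Forward Euler is used purely for illustration.

    def integrate(rate, e0, dt=1e-3, t_max=20.0):
        """Integrate dE/dt = -rate(E) from E(0) = e0; return the final time and error."""
        e, t = e0, 0.0
        while t < t_max and e > 0.0:
            e = max(0.0, e - dt * rate(e))  # clamp: the error cannot become negative
            t += dt
        return t, e

    c, k, e0 = 1.0, 1.0 / 3.0, 2.0

    t_term, e_term = integrate(lambda e: c * e ** k, e0)  # terminal attractor
    t_grad, e_grad = integrate(lambda e: c * e, e0)       # ordinary (regular) attractor

    print("terminal attractor: E =", e_term, "reached at t =", round(t_term, 3))
    print("predicted finite time:", e0 ** (1 - k) / (c * (1 - k)))   # about 2.38
    print("plain gradient flow: E =", e_grad, "still positive at t =", round(t_grad, 3))

The stopping time depends only on the initial error and on the constants c and k, not on the detailed shape of the cost; this is the sense in which, for local-minima-free error functions, the number of steps is independent of the cost function. At a local minimum with E > 0, however, the gradient vanishes and a rescaled update of the form above is no longer defined, which is one reason why no comparable guarantee survives for multimodal functions.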

[1] Michail Zak, et al. Terminal attractors in neural networks, 1989, Neural Networks.

[2] Alan F. Murray, et al. IEEE International Conference on Neural Networks, 1997.

[3] J. Stephen Judd, et al. Neural network design and the complexity of learning, 1990, Neural Network Modeling and Connectionism.

[4] Hervé Bourlard, et al. Speech pattern discrimination and multilayer perceptrons, 1989.

[5] Alberto Tesi, et al. On the Problem of Local Minima in Backpropagation, 1992, IEEE Trans. Pattern Anal. Mach. Intell.

[6] Jinhui Chao, et al. How to find global minima in finite times of search for multilayer perceptrons training, 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.

[7] Ching-Chi Hsu, et al. Terminal attractor learning algorithms for back propagation neural networks, 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.

[8] Marco Gori, et al. On the problem of local minima in recurrent neural networks, 1994, IEEE Trans. Neural Networks.

[9] Xiao-Hu Yu, et al. Can backpropagation error surface not have local minima, 1992, IEEE Trans. Neural Networks.

[10] Chi-Ping Tsang, et al. On the Convergence of Feed Forward Neural Networks Incorporating Terminal Attractors, 1993.

[11] Bedri C. Cetin, et al. Terminal repeller unconstrained subenergy tunneling (TRUST) for fast global optimization, 1993.

[12] Joel W. Burdick, et al. Global descent replaces gradient descent to avoid local minima problem in learning with artificial neural networks, 1993, IEEE International Conference on Neural Networks.

[13] David Haussler, et al. What Size Net Gives Valid Generalization?, 1989, Neural Computation.

[14] Aimo A. Törn, et al. Global Optimization, 1999, Science.

[15] Norio Baba, et al. A new approach for finding the global minimum of error function of neural networks, 1989, Neural Networks.

[16] A. A. Zhigljavsky, et al. Theory of Global Random Search, 1991.

[17] M. Zak. Terminal attractors for addressable memory in neural networks, 1988.

[18] C. P. Tsang, et al. On the convergence of feedforward neural networks incorporating terminal attractors, 1993, IEEE International Conference on Neural Networks.