Neural network constructive algorithms: Trading generalization for learning efficiency?

There are currently several types of constructive (or growth) algorithms available for training a feed-forward neural network. This paper describes and explains the main ones, using a fundamental approach to the problem-solving mechanisms of the multi-layer perceptron. The claimed convergence properties of the algorithms are verified using just two mapping theorems, which enables all of the algorithms to be unified under a single basic mechanism. The algorithms are compared and contrasted, and the deficiencies of some are highlighted. The fundamental reasons for the success of these algorithms are extracted and used to suggest where they might most fruitfully be applied. A suspicion that they are not a panacea for all current neural network difficulties, and that one must somewhere along the line pay for the learning efficiency they promise, is developed into an argument that their generalization abilities will, on average, lie below those of back-propagation.
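To make the growth mechanism concrete, the sketch below trains a single-hidden-layer network by gradient descent and adds a freshly initialised hidden unit whenever the training error plateaus, in the spirit of dynamic node creation (Ash, 1989). This is a minimal illustration, not any specific algorithm analysed in the paper: the XOR task, learning rate, plateau thresholds, and growth policy are all assumptions chosen for brevity.

```python
# Minimal sketch of a constructive ("grow when stuck") training loop.
# All hyperparameters and the task are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy task: XOR, a classic test case for constructive methods.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def init_net(n_hidden):
    return {
        "W1": rng.normal(0, 0.5, (2, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0, 0.5, (n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(net, X):
    h = sigmoid(X @ net["W1"] + net["b1"])
    out = sigmoid(h @ net["W2"] + net["b2"])
    return h, out

def train_epoch(net, lr=0.5):
    # One full-batch gradient-descent step on squared error.
    h, out = forward(net, X)
    err = out - y
    d_out = err * out * (1 - out)                 # delta at the output unit
    d_h = (d_out @ net["W2"].T) * h * (1 - h)     # deltas at the hidden units
    net["W2"] -= lr * h.T @ d_out
    net["b2"] -= lr * d_out.sum(axis=0)
    net["W1"] -= lr * X.T @ d_h
    net["b1"] -= lr * d_h.sum(axis=0)
    return float((err ** 2).mean())

def add_hidden_unit(net):
    # Grow the hidden layer by one new unit while keeping
    # all previously learned weights intact.
    net["W1"] = np.hstack([net["W1"], rng.normal(0, 0.5, (2, 1))])
    net["b1"] = np.append(net["b1"], 0.0)
    net["W2"] = np.vstack([net["W2"], rng.normal(0, 0.5, (1, 1))])

net = init_net(n_hidden=1)                        # deliberately too small for XOR
prev_loss, patience = np.inf, 0
for epoch in range(20000):
    loss = train_epoch(net)
    if loss < 0.01:                               # converged on the training set
        break
    # Plateau test: grow when the error stops improving for a while.
    patience = patience + 1 if prev_loss - loss < 1e-5 else 0
    prev_loss = loss
    if patience > 500:
        add_hidden_unit(net)
        patience = 0

print(f"final loss {loss:.4f} with {net['W1'].shape[1]} hidden units")
```

The trade-off the paper questions is visible even here: each added unit buys learning efficiency (the plateau is escaped by enlarging the hypothesis space) at the potential cost of generalization, since nothing in the growth policy penalises network size.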
