Modifying the Generalized Delta Rule to Train Networks of Non-monotonic Processors for Pattern Classification

A modification of the generalized delta rule is described that is capable of training multilayer networks of value units, i.e., units defined by a particular non-monotonic activation function, the Gaussian. For simple pattern classification problems, this rule produces networks with several advantages over standard feedforward networks: they require fewer processing units and can be trained much more quickly. Though superficially similar, there are fundamental differences between the networks trained by this new learning rule and radial basis function networks. These differences suggest that value unit networks may be better suited for learning some pattern classification tasks and for answering general questions related to the organization of neurophysiological systems.
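
To make the kind of network described above concrete, the sketch below implements a small feedforward network of Gaussian "value units" trained by ordinary gradient descent on a toy classification problem (XOR). The activation G(net) = exp(-pi * net^2), the two-unit hidden layer, the learning rate, the epoch count, and the use of plain squared-error backpropagation are all illustrative assumptions; in particular, this sketch applies the unmodified generalized delta rule with the Gaussian derivative substituted for the usual sigmoid derivative, not the modified rule the paper actually proposes.

```python
# Minimal sketch: a network of Gaussian "value units" trained by plain
# gradient descent on XOR. All hyperparameters here are assumptions for
# illustration; this is NOT the authors' modified learning rule.

import numpy as np

rng = np.random.default_rng(0)

def gaussian(net):
    # Value-unit activation: peaks at net = 0 and falls off on both sides.
    return np.exp(-np.pi * net ** 2)

def d_gaussian(net):
    # Derivative of the Gaussian activation with respect to the net input.
    return -2.0 * np.pi * net * np.exp(-np.pi * net ** 2)

# XOR: a simple, non-linearly-separable classification problem.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])

n_hidden = 2      # assumed network size
lr = 0.5          # assumed learning rate
W1 = rng.normal(scale=0.5, size=(2, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
b2 = np.zeros(1)

for epoch in range(5000):
    # Forward pass through two layers of Gaussian value units.
    net1 = X @ W1 + b1
    h = gaussian(net1)
    net2 = h @ W2 + b2
    y = gaussian(net2)

    err = y - T

    # Backward pass: generalized-delta-rule gradients for squared error,
    # using the Gaussian derivative in place of the sigmoid derivative.
    delta2 = err * d_gaussian(net2)
    delta1 = (delta2 @ W2.T) * d_gaussian(net1)

    W2 -= lr * h.T @ delta2
    b2 -= lr * delta2.sum(axis=0)
    W1 -= lr * X.T @ delta1
    b1 -= lr * delta1.sum(axis=0)

# Trained outputs next to targets; convergence is not guaranteed with the
# unmodified rule, which is part of the motivation for the paper's method.
print(np.hstack([np.round(y, 2), T]))
```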
