Learning with Product Units

Product units provide a method of automatically learning the higher-order input combinations required for efficient learning in neural networks. However, we show that problems are encountered when backpropagation is used to train networks containing these units. This paper examines these problems and proposes some atypical heuristics to improve learning. Using these heuristics, a constructive method is introduced that solves well-researched problems with significantly fewer neurons than previously reported. In addition, product units are implemented as candidate units in the Cascade Correlation (Fahlman & Lebiere, 1990) system. This resulted in smaller networks that trained faster than those using sigmoidal or Gaussian units.
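As a rough illustration of what a product unit computes, the sketch below follows the commonly used formulation (see [9]) in which the unit outputs the product of its inputs, each raised to a learnable weight, with the real part taken when an input is negative. The function name `product_unit` and the restriction to nonzero inputs are assumptions made for this example; it is not the implementation used in the paper.

```python
import math


def product_unit(inputs, weights):
    """Real part of prod_i inputs[i] ** weights[i].

    Assumption for this sketch: every input is nonzero, so log|x| is defined.
    """
    log_magnitude = 0.0        # accumulates sum_i w_i * ln|x_i|
    negative_weight_sum = 0.0  # accumulates w_i over inputs with x_i < 0
    for x, w in zip(inputs, weights):
        if x == 0:
            raise ValueError("this sketch assumes nonzero inputs")
        log_magnitude += w * math.log(abs(x))
        if x < 0:
            negative_weight_sum += w
    # (-1) ** w has real part cos(pi * w); positive inputs contribute no phase.
    return math.exp(log_magnitude) * math.cos(math.pi * negative_weight_sum)
```

With weights `[1.0, 2.0]` the unit represents the higher-order term x1 * x2^2, so a single unit can stand in for an input combination that would otherwise have to be supplied by hand.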

[1] Thomas M. Cover, et al. Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition, 1965, IEEE Trans. Electron. Comput.

[2] Richard M. Karp, et al. The Differencing Method of Set Partitioning, 1983.

[3] Marcus Frean, et al. The Upstart Algorithm: A Method for Constructing and Training Feedforward Neural Networks, 1990, Neural Computation.

[4] Timur Ash, et al. Dynamic node creation in backpropagation networks, 1989.

[5] C. D. Gelatt, et al. Optimization by Simulated Annealing, 1983, Science.

[6] Robert A. Jacobs, et al. Increased rates of convergence through learning rate adaptation, 1987, Neural Networks.

[7] Christian Lebiere, et al. The Cascade-Correlation Learning Architecture, 1989, NIPS.

[8] A. Lapedes, et al. Nonlinear Signal Processing Using Neural Networks, 1987.

[9] David E. Rumelhart, et al. Product Units: A Computationally Powerful and Biologically Plausible Extension to Backpropagation Networks, 1989, Neural Computation.

[10] Nicholas J. Redding, et al. Constructive higher-order network that is polynomial time, 1993, Neural Networks.

[11] Colin Giles, et al. Learning, invariance, and generalization in high-order neural networks, 1987, Applied Optics.

[12] J. Nadal, et al. Learning in feedforward layered networks: the tiling algorithm, 1989.

[13] Michael R. W. Dawson, et al. Modifying the Generalized Delta Rule to Train Networks of Non-monotonic Processors for Pattern Classification, 1992.

[14] A. Lapedes, et al. Nonlinear signal processing using neural networks: Prediction and system modelling, 1987.