Self-regulation of model order in feedforward neural networks

Despite the availability of theoretical results, the application of feedforward neural networks is hampered by the lack of systematic procedures for determining the number of hidden neurons to use. The number of hidden-layer neurons determines the order of the neural network model and, consequently, the generalization performance of the network. This paper puts into perspective the approaches used to address this problem and presents a new paradigm that uses dependent evolution of hidden-layer neurons to self-regulate the model order. We show through simulations that, despite an abundance of free parameters (i.e., starting with a larger-than-necessary network), the proposed paradigm localizes specializing hidden-layer neurons while the unspecialized hidden-layer neurons come to behave similarly. These similarly behaving neurons reduce the effective model order and confer the benefits of a smaller network. Hints toward an analytical understanding of this behavior are also noted.
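The abstract does not spell out how "dependent evolution" of the hidden neurons is realized. The sketch below is a minimal, hypothetical illustration only: it assumes the dependence takes the form of a lateral coupling penalty that pulls each hidden neuron's incoming weights toward those of its ring neighbors during backpropagation, so that neurons the data does not force apart drift together and reduce the effective model order. The toy task (sin regression), the ring-neighbor coupling, and the parameters H, lr, and lam are all assumptions for illustration, not the paper's method.

```python
# Hypothetical sketch of "dependent evolution" of hidden neurons: a
# one-hidden-layer tanh network trained by backpropagation, with an
# assumed lateral coupling term that pulls each hidden neuron's incoming
# weights toward those of its ring neighbors. Unspecialized neurons
# drift together, lowering the effective model order.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression task (an assumption): y = sin(x) on [-pi, pi].
X = rng.uniform(-np.pi, np.pi, size=(200, 1))
y = np.sin(X)

H = 20                        # deliberately oversized hidden layer
W1 = rng.normal(0, 0.5, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (H, 1)); b2 = np.zeros(1)

lr, lam = 0.05, 0.01          # learning rate, coupling strength (assumed)

for epoch in range(5000):
    # Forward pass.
    A = np.tanh(X @ W1 + b1)          # hidden activations, (N, H)
    out = A @ W2 + b2                 # network output, (N, 1)

    # Backward pass for mean squared error.
    d_out = (out - y) / len(X)
    dW2 = A.T @ d_out
    db2 = d_out.sum(0)
    dZ = (d_out @ W2.T) * (1 - A**2)  # tanh derivative
    dW1 = X.T @ dZ
    db1 = dZ.sum(0)

    # Assumed lateral coupling: penalize each hidden unit's deviation
    # from the mean of its two ring neighbors' incoming weights.
    neighbor_mean = 0.5 * (np.roll(W1, 1, axis=1) + np.roll(W1, -1, axis=1))
    dW1 += lam * (W1 - neighbor_mean)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# Gauge the effective model order: count near-duplicate hidden units.
w = np.concatenate([W1, b1[None, :]]).T           # (H, 2) incoming params
dists = np.linalg.norm(w[:, None] - w[None, :], axis=-1)
duplicate_pairs = ((dists < 0.1).sum() - H) // 2  # exclude self-distances
print(f"near-duplicate hidden-unit pairs: {duplicate_pairs} of {H} units")
```

Under this assumed coupling, hidden units that the task does not pull toward distinct features converge to nearly identical weight vectors; counting such near-duplicates gives a rough proxy for the reduced model order the abstract describes.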
