Toward Optimally Distributed Computation

This article introduces the concept of optimally distributed computation in feedforward neural networks, achieved by regularizing weight saliency. By constraining the relative importance of individual parameters, computation can be distributed thinly and evenly throughout the network. We propose that this benefits both fault tolerance and generalization ability in large network architectures, and we verify these theoretical predictions through simulation experiments on two problems: one artificial, the other a real-world task. In summary, this article presents regularization terms for distributing neural computation optimally.
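The abstract does not reproduce the article's actual regularization terms. As a rough illustration of the general idea only, the sketch below penalizes the variance of per-weight saliencies so that no single parameter carries a disproportionate share of the computation. The squared-weight saliency proxy, the function name `saliency_evenness_penalty`, and the strength `lam` are assumptions for this sketch, not the paper's formulation.

```python
import torch
import torch.nn as nn

def saliency_evenness_penalty(model: nn.Module) -> torch.Tensor:
    """Penalize uneven weight saliency (illustrative sketch).

    Saliency is approximated here by the squared weight, a simple
    first-order proxy; the article's saliency measure may differ.
    Minimizing the variance of saliencies encourages importance to be
    spread thinly and evenly across the network.
    """
    # Collect saliencies of all weight matrices (biases excluded).
    saliencies = torch.cat([p.pow(2).flatten()
                            for p in model.parameters() if p.dim() > 1])
    return saliencies.var()

# Hypothetical usage inside a training step, where `model`, `criterion`,
# and the regularization strength `lam` are assumed to be defined:
#   loss = criterion(model(x), y) + lam * saliency_evenness_penalty(model)
```

Under this formulation the penalty is zero only when every weight contributes equally, which matches the stated goal of distributing computation evenly; the trade-off against task loss is controlled by the assumed strength `lam`.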
