Can deterministic penalty terms model the effects of synaptic weight noise on network fault-tolerance?

This paper investigates fault tolerance in feedforward neural networks under a realistic fault model based on analog hardware. In previous work we showed that injecting synaptic weight noise during training significantly enhances fault tolerance relative to standard training algorithms, and we proposed that the noise distributes the network computation more evenly across the weights. Here we compare those results against a deterministic approximation to the mechanisms induced by stochastic weight noise, incorporated into training via penalty terms. Because the penalty terms amount to an approximation of weight saliency, we additionally assess a number of other weight saliency measures in comparison experiments. The results show that the first-term approximation is an incomplete model of weight noise with respect to fault tolerance, and that the error Hessian is the most accurate of the weight saliency measures considered.
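To make the connection between the two approaches concrete: under multiplicative weight noise w_i -> w_i(1 + delta_i) with delta_i ~ N(0, sigma^2), a Taylor expansion of the training error E gives an expected error of roughly E + (sigma^2/2) * sum_i w_i^2 * d^2E/dw_i^2, which is why deterministic penalty terms built from weight-saliency estimates can stand in for the noise itself. The sketch below is our own illustration, not the paper's code: it trains a toy network both ways and compares fault tolerance. The squared-gradient stand-in for the Hessian diagonal, the fault model that zeroes a random fraction of weights, and all hyperparameters are assumptions made for the example.

```python
# Illustrative sketch only: contrasts training with stochastic weight noise
# against a deterministic penalty derived from its Taylor expansion.
# The squared-gradient proxy for the Hessian diagonal, the fault model,
# and all hyperparameters are assumptions, not the paper's setup.
import torch

torch.manual_seed(0)
X = torch.randn(256, 8)
y = torch.sin(X.sum(dim=1, keepdim=True))
sigma = 0.1  # relative amplitude of the multiplicative weight noise

def init_params():
    W1 = (0.5 * torch.randn(8, 16)).requires_grad_()
    b1 = torch.zeros(16, requires_grad=True)
    W2 = (0.5 * torch.randn(16, 1)).requires_grad_()
    b2 = torch.zeros(1, requires_grad=True)
    return [W1, b1, W2, b2]

def forward(params, X):
    W1, b1, W2, b2 = params
    return torch.tanh(X @ W1 + b1) @ W2 + b2

def noisy_mse(params, X, y):
    # Stochastic version: perturb each weight on every forward pass,
    # w -> w * (1 + delta), delta ~ N(0, sigma^2).
    W1, b1, W2, b2 = params
    nW1 = W1 * (1 + sigma * torch.randn_like(W1))
    nW2 = W2 * (1 + sigma * torch.randn_like(W2))
    out = torch.tanh(X @ nW1 + b1) @ nW2 + b2
    return ((out - y) ** 2).mean()

def penalized_mse(params, X, y):
    # Deterministic version: noiseless error plus a penalty mimicking the
    # expected effect of the noise. Here the Hessian diagonal is replaced
    # by the squared first derivative, a crude first-term-style proxy.
    loss = ((forward(params, X) - y) ** 2).mean()
    W1, _, W2, _ = params
    g1, g2 = torch.autograd.grad(loss, [W1, W2], create_graph=True)
    penalty = ((W1 * g1) ** 2).sum() + ((W2 * g2) ** 2).sum()
    return loss + 0.5 * sigma ** 2 * penalty

def train(loss_fn):
    params = init_params()
    opt = torch.optim.SGD(params, lr=0.05)
    for _ in range(500):
        opt.zero_grad()
        loss_fn(params, X, y).backward()
        opt.step()
    return params

def faulted_mse(params, frac=0.1, trials=20):
    # Crude fault model: set a random fraction of the weights to zero
    # and measure the resulting error, averaged over several trials.
    errs = []
    with torch.no_grad():
        for _ in range(trials):
            W1, b1, W2, b2 = [p.clone() for p in params]
            for W in (W1, W2):
                W[torch.rand_like(W) < frac] = 0.0
            out = torch.tanh(X @ W1 + b1) @ W2 + b2
            errs.append(((out - y) ** 2).mean().item())
    return sum(errs) / len(errs)

for name, fn in [("weight noise", noisy_mse), ("penalty term", penalized_mse)]:
    p = train(fn)
    print(f"{name}: faulted MSE = {faulted_mse(p):.4f}")
```

In this toy setting the two training regimes can be compared directly on the faulted error; the paper's finding is that such a first-term deterministic penalty captures only part of what the stochastic noise does for fault tolerance.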
