On the Selection of Weight Decay Parameter for Faulty Networks

The weight-decay technique is an effective approach to handling overfitting and weight faults. For fault-free networks, an inappropriate value of the decay parameter leads to a network that is either overfitted or underfitted. However, many existing results on the selection of the decay parameter consider fault-free networks only. It is well known that the weight-decay method can also suppress the effect of weight faults. In the faulty case, using a test set to select the decay parameter is not practical, because a trained network has a huge number of possible faulty versions. This paper develops two mean prediction error (MPE) formulae for predicting the performance of faulty radial basis function (RBF) networks. Two fault models are considered: multiplicative weight noise and open weight fault. Our MPE formulae involve only the training error and the trained weights. Moreover, our method does not require generating a large number of faulty networks in order to measure the test error under faults. The MPE formulae allow us to select appropriate values of the decay parameter for faulty networks. Our experiments show that, although there are small differences between the true test errors (measured on a test set) and the MPE values, the MPE formulae accurately locate the value of the decay parameter that minimizes the true test error of faulty networks.
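To make the setting concrete: weight-decay training of an RBF network minimizes the standard regularized objective, training error plus lambda * ||w||^2, where lambda is the decay parameter discussed above. The two fault models named in the abstract are commonly defined as multiplicative weight noise, where each trained weight w_i becomes w_i * (1 + b_i) with b_i a zero-mean random variable, and open weight fault, where each weight is independently stuck at zero with some probability. The sketch below is not the paper's MPE formulae; it only illustrates, under these assumptions, the brute-force alternative the paper argues is impractical: sampling many faulty copies of a weight-decay-trained RBF network for each candidate lambda and averaging their test errors. All names (rbf_design_matrix, train_rbf, noise_std, open_prob, and the toy data) are hypothetical illustration choices.

    import numpy as np

    def rbf_design_matrix(x, centers, width):
        # Gaussian RBF hidden-layer outputs for 1-D inputs x and the given centers.
        return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

    def train_rbf(H, y, decay):
        # Weight-decay (ridge) solution: w = (H'H + decay * I)^(-1) H'y.
        m = H.shape[1]
        return np.linalg.solve(H.T @ H + decay * np.eye(m), H.T @ y)

    def sample_faulty_weights(w, rng, noise_std=0.0, open_prob=0.0):
        # Multiplicative weight noise: w_i -> w_i * (1 + b_i), b_i ~ N(0, noise_std^2).
        # Open weight fault: each weight independently stuck at zero with prob open_prob.
        noisy = w * (1.0 + noise_std * rng.standard_normal(w.shape))
        mask = rng.random(w.shape) >= open_prob
        return noisy * mask

    def brute_force_fault_mse(H_test, y_test, w, n_samples=1000, **fault_kw):
        # Average test MSE over many sampled faulty networks: the costly procedure
        # that an analytic mean-prediction-error formula is meant to replace.
        rng = np.random.default_rng(0)
        errs = []
        for _ in range(n_samples):
            w_f = sample_faulty_weights(w, rng, **fault_kw)
            errs.append(np.mean((H_test @ w_f - y_test) ** 2))
        return float(np.mean(errs))

    # Toy example: pick the decay parameter that minimizes the simulated fault error.
    rng = np.random.default_rng(1)
    x_train, x_test = rng.uniform(-1, 1, 80), rng.uniform(-1, 1, 200)
    y_train = np.sin(np.pi * x_train) + 0.1 * rng.standard_normal(80)
    y_test = np.sin(np.pi * x_test)
    centers, width = np.linspace(-1, 1, 20), 0.2
    H_tr = rbf_design_matrix(x_train, centers, width)
    H_te = rbf_design_matrix(x_test, centers, width)

    for decay in [1e-4, 1e-3, 1e-2, 1e-1, 1.0]:
        w = train_rbf(H_tr, y_train, decay)
        mse = brute_force_fault_mse(H_te, y_test, w, noise_std=0.2, open_prob=0.05)
        print(f"decay={decay:g}  simulated faulty-network MSE={mse:.4f}")

Because the simulation must be repeated for every candidate decay value and every fault level, its cost grows quickly; a closed-form MPE that uses only the training error and the trained weights avoids this sampling loop entirely, which is the practical point the abstract makes.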
