On the accuracy of mapping by neural networks trained by backpropagation with forgetting

Mapping properties of multi-layer, feedforward artificial neural networks are analyzed using a modified backpropagation training rule with forgetting (decay) of the connection weights. Neural networks trained by the forgetting algorithm are insensitive to the initial choice of the network, and the trained network structure can be used for knowledge acquisition about the feature classes. The accuracy of the non-linear mapping realized by layered neural networks is limited in the sense of the minimum classification error, and it can be estimated from the a posteriori probability densities of the training classes. This paper shows that backpropagation with forgetting is a convenient tool for implementing this finite accuracy of learning. The proposed strategy has been applied to anomaly detection in real measured time series. Neural networks trained by the forgetting algorithm are shown to generalize better than those trained by standard backpropagation, and the analyzed feature classes have been characterized using information extracted from the structure of the trained network.
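
To make the training rule concrete, the following is a minimal sketch of backpropagation with forgetting, assuming an Ishikawa-style decay term: the usual gradient step plus a constant penalty epsilon * sign(w) that drives weights not supported by the data toward zero. The network size, learning rate, decay constant eps, and the XOR toy data below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_with_forgetting(X, y, hidden=8, eta=0.5, eps=1e-3, epochs=2000):
    """One-hidden-layer classifier trained by backprop with a forgetting term."""
    n_in = X.shape[1]
    W1 = rng.normal(scale=0.5, size=(n_in, hidden))
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    for _ in range(epochs):
        # Forward pass.
        h = sigmoid(X @ W1)
        out = sigmoid(h @ W2)
        # Backward pass for a squared-error loss.
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        # Gradient step plus forgetting: a constant decay of |w| toward zero,
        # independent of the error gradient.
        W2 -= eta * (h.T @ d_out) + eps * np.sign(W2)
        W1 -= eta * (X.T @ d_h) + eps * np.sign(W1)
    return W1, W2

# Toy usage on an XOR-like problem.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
W1, W2 = train_with_forgetting(X, y)
print("hidden-layer weights:\n", np.round(W1, 2))
```

After training, connections whose weights have decayed to near zero can be read off the weight matrices; this pruned structure is the sense in which a network trained with forgetting supports knowledge acquisition about the feature classes.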
