Weight smoothing to improve network generalization

This paper proposes a weight smoothing algorithm to improve a neural network's generalization capability. The algorithm applies when the data patterns to be classified are presented on an n-dimensional grid (n ≥ 1) and there exist correlations among neighboring data points within a pattern. In a fully interconnected feedforward net, no such correlation information is embedded in the architecture, so the correlations can be extracted only through a sufficient amount of network training. The proposed algorithm incorporates a smoothing constraint into the objective function of backpropagation to reflect the neighborhood correlations and to seek solutions that have smooth connection weights. Experiments were performed on waveform classification, multifont alphanumeric character recognition, and handwritten numeral recognition. The results indicate that (1) networks trained with the algorithm do have smooth connection weights, and (2) they generalize better.
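
As a rough illustration of how a smoothing constraint might enter the backpropagation objective, the sketch below penalizes squared differences between the weights that one hidden unit assigns to neighboring grid inputs. The abstract does not give the exact form of the constraint, so this penalty, the function names smoothness_penalty and smoothness_gradient, and the weighting factor lam are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np


def smoothness_penalty(W, grid_shape):
    """Assumed penalty: sum of squared differences between the weights
    attached to neighboring grid inputs, summed over all hidden units.

    W has shape (n_hidden, n_inputs); each row is laid out on grid_shape.
    """
    Wg = np.asarray(W, dtype=float).reshape((W.shape[0],) + tuple(grid_shape))
    penalty = 0.0
    for axis in range(1, Wg.ndim):        # every grid axis, skipping the unit axis
        d = np.diff(Wg, axis=axis)        # d[i] = w[i + 1] - w[i] along this axis
        penalty += float(np.sum(d ** 2))
    return penalty


def smoothness_gradient(W, grid_shape):
    """Gradient of smoothness_penalty with respect to W (same shape as W)."""
    Wg = np.asarray(W, dtype=float).reshape((W.shape[0],) + tuple(grid_shape))
    grad = np.zeros(Wg.shape, dtype=float)
    for axis in range(1, Wg.ndim):
        d = np.diff(Wg, axis=axis)
        upper = [slice(None)] * Wg.ndim   # selects w[i + 1]
        lower = [slice(None)] * Wg.ndim   # selects w[i]
        upper[axis] = slice(1, None)
        lower[axis] = slice(None, -1)
        grad[tuple(upper)] += 2.0 * d     # d(d[i]^2) / d w[i + 1]
        grad[tuple(lower)] -= 2.0 * d     # d(d[i]^2) / d w[i]
    return grad.reshape(W.shape)


# Hypothetical use inside a training step: add the penalty gradient,
# scaled by an assumed factor lam, to the usual backpropagated gradient.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 16 * 16))         # 4 hidden units on a 16 x 16 input grid
data_grad = np.zeros_like(W)              # placeholder for the backprop gradient
lam, lr = 0.01, 0.1
W_new = W - lr * (data_grad + lam * smoothness_gradient(W, (16, 16)))
```

With the data gradient set to zero, as in this placeholder step, the update only pulls neighboring weights toward each other. In actual training the factor lam would trade off fit to the training patterns against weight smoothness.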
