Noise suppression in training examples for improving generalization capability

For supervised learning problems, error-correcting memorization learning was proposed as a way to suppress noise in teacher signals. In this paper, we discuss the generalization capability of this learning method. Generalization capability is evaluated by the projection learning criterion. We give a necessary and sufficient condition for error-correcting memorization learning to provide the same level of generalization as projection learning, and suggest how to choose a training set so that the condition is satisfied. Moreover, we show that noise suppression based on the error-correcting memorization learning criterion always improves generalization toward the level of projection learning.
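
To make the two criteria concrete, the following LaTeX sketch states them in the operator notation standard in Ogawa's function-approximation framework; the symbols H, A, y, n, X, and the projection constraint are our assumptions for illustration, since the abstract itself introduces no notation.

```latex
% A minimal sketch of the operator framework assumed here (notation is
% ours, following Ogawa-style function approximation; not defined in the
% abstract itself).
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Let $H$ be a reproducing kernel Hilbert space of functions to be learned,
$A\colon H\to\mathbb{C}^{m}$ the sampling operator, and
\[
  y \;=\; A f + n
\]
the noisy teacher signal for the true function $f\in H$, where $n$ is
additive noise. A learning operator $X\colon\mathbb{C}^{m}\to H$ yields
the estimate $\hat f = X y$.

\emph{Memorization learning} requires the estimate to reproduce the
training data, $A X y = y$, whereas \emph{projection learning} chooses
$X$ to minimize the expected contribution of the noise,
\[
  \min_{X}\; \mathbb{E}_{n}\,\lVert X n\rVert^{2}
  \quad\text{subject to}\quad X A = P_{\mathcal{R}(A^{*})},
\]
where $P_{\mathcal{R}(A^{*})}$ denotes the orthogonal projection onto the
range of the adjoint operator $A^{*}$.
\end{document}
```

Under this reading, error-correcting memorization learning first suppresses the noise $n$ in the teacher signal $y$ and then memorizes the corrected signal; the paper's condition can be understood as characterizing when the resulting learning operator also meets the projection learning criterion above.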
