Empirical generalization assessment of neural network models

This paper addresses the assessment of the generalization performance of neural network models by empirical techniques. We propose combining cross-validation with a resampling technique to estimate the distribution of the generalization performance of a specific model. This makes it possible to formulate a range of new generalization performance measures. Numerical results demonstrate the viability of the approach compared with the standard technique of using algebraic estimates such as the final prediction error (FPE). Moreover, we consider the problem of comparing the generalization performance of different competing models. Since all models are trained on the same data, a key issue is to take this dependency into account. The optimal split of a data set of size N into a cross-validation set of size Nγ and a training set of size N(1-γ) is discussed. Asymptotically (for large data sets), γ_opt → 1, so that a relatively larger share of the data is left for validation.
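As an illustration only, the idea of combining a hold-out split with resampling can be sketched as below. The abstract does not specify the resampling scheme, so repeated random splitting is assumed here. A least-squares line stands in for the neural network, and the function names (`fit_line`, `resampled_cv_errors`, `fpe`) and the synthetic data are hypothetical. The sketch collects the validation errors over many splits of N points into Nγ validation and N(1-γ) training points, summarizes their empirical distribution, and prints Akaike's FPE for comparison.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (a stand-in for training a network)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def mse(model, xs, ys):
    """Mean squared error of the fitted line on the given points."""
    a, b = model
    return statistics.fmean([(y - (a * x + b)) ** 2 for x, y in zip(xs, ys)])


def resampled_cv_errors(xs, ys, gamma=0.5, n_resamples=200, seed=0):
    """Validation errors over repeated random train/validation splits."""
    rng = random.Random(seed)
    n = len(xs)
    n_val = max(1, round(n * gamma))  # validation set of size N*gamma
    errors = []
    for _ in range(n_resamples):
        idx = list(range(n))
        rng.shuffle(idx)
        val, train = idx[:n_val], idx[n_val:]  # training set of size N*(1-gamma)
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.append(mse(model, [xs[i] for i in val], [ys[i] for i in val]))
    return errors


def fpe(xs, ys, n_params=2):
    """Akaike's FPE: training MSE scaled by (N + p) / (N - p)."""
    n = len(xs)
    return mse(fit_line(xs, ys), xs, ys) * (n + n_params) / (n - n_params)


# Synthetic data (an assumption): y = 2x + 1 plus Gaussian noise, variance 0.25.
data_rng = random.Random(1)
xs = [data_rng.uniform(-1, 1) for _ in range(100)]
ys = [2 * x + 1 + data_rng.gauss(0, 0.5) for x in xs]

errors = resampled_cv_errors(xs, ys, gamma=0.5)
deciles = statistics.quantiles(errors, n=10)
print("mean:", statistics.fmean(errors))
print("median:", statistics.median(errors))
print("10%-90% range:", deciles[0], deciles[-1])
print("FPE:", fpe(xs, ys))
```

Unlike a single algebraic estimate, the list `errors` gives access to the spread of the estimate (quantiles, variance), which is the kind of distribution-based measure the abstract refers to. The sketch does not address the dependency between competing models trained on the same data.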
