Measuring and Improving Neural Network Generalization for Model Updating

This paper compares various techniques for measuring the generalization ability of a neural network used for model-updating purposes. An appropriate metric for measuring generalization ability is suggested, and it is used to investigate and compare various neural network architectures and training algorithms. The effect of noise on generalization ability is considered, and it is shown that the precise form of the noise appears unimportant to the networks. This implies that the optimum training location may be obtained by considering a simple noise model such as Gaussian noise. Various radial basis function neurons and training algorithms are considered. Significant improvements in generalization ability are obtained by merging the holdout and training data sets before training the second layer of the network, once the network architecture has been decided. The Gaussian radial basis function is rejected as the radial basis function of choice, owing to the uncertainty in choosing an appropriate value for its spread constant. Several alternative radial basis functions that require no spread constant, such as the thin-plate spline, are noted to give excellent results. Finally, the use of jitter and of committees of networks to improve generalization ability is considered. Jitter is found neither to improve nor to degrade the results, whereas a committee of networks performs better than any single network. A good method of generating committee members is to split the available data evenly into multiple random holdout and training data sets.
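To make two of the ideas summarised above concrete, the following Python/NumPy sketch shows one possible realisation of a radial basis function network with a thin-plate-spline basis (which needs no spread constant) and of a committee whose members are trained on different random training/holdout splits of the same data. It is not the implementation used in the paper; the function names, the number of centres, and the least-squares fit of the second layer are illustrative assumptions only.

```python
import numpy as np

def thin_plate_spline(r):
    """Thin-plate spline basis phi(r) = r^2 * ln(r); no spread constant is required."""
    # r^2 * ln(r) -> 0 as r -> 0, so distances of zero are mapped to zero.
    out = np.zeros_like(r)
    nz = r > 0
    out[nz] = r[nz] ** 2 * np.log(r[nz])
    return out

def fit_rbf(X_train, y_train, centres):
    """Fit the second (linear) layer of an RBF network by least squares."""
    # Pairwise distances between training points and the chosen centres.
    r = np.linalg.norm(X_train[:, None, :] - centres[None, :, :], axis=2)
    H = thin_plate_spline(r)                          # hidden-layer design matrix
    H = np.hstack([H, np.ones((len(X_train), 1))])    # bias column
    w, *_ = np.linalg.lstsq(H, y_train, rcond=None)
    return w

def predict_rbf(X, centres, w):
    r = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    H = thin_plate_spline(r)
    H = np.hstack([H, np.ones((len(X), 1))])
    return H @ w

def train_committee(X, y, n_members=5, holdout_frac=0.5, n_centres=20, seed=None):
    """Each member sees a different random training/holdout split of the same data."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        idx = rng.permutation(len(X))
        n_train = int((1.0 - holdout_frac) * len(X))
        train_idx = idx[:n_train]  # the holdout part would be used to judge generalization;
                                   # the second layer could then be refit on the merged data.
        centres = X[rng.choice(train_idx, size=min(n_centres, n_train), replace=False)]
        w = fit_rbf(X[train_idx], y[train_idx], centres)
        members.append((centres, w))
    return members

def predict_committee(X, members):
    """Committee output is the average of the individual member predictions."""
    preds = np.stack([predict_rbf(X, c, w) for c, w in members])
    return preds.mean(axis=0)
```

In use, `predict_committee` simply averages the member outputs; in line with the abstract, each member's holdout portion would serve to assess its generalization before the second layer is retrained on the merged holdout and training data.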