Selecting neural network architectures via the prediction risk: application to corporate bond rating prediction

The notion of generalization can be defined precisely as the prediction risk, the expected performance of an estimator on new observations. The authors propose the prediction risk as a measure of the generalization ability of multi-layer perceptron networks and use it to select the optimal network architecture. The prediction risk must be estimated from the available data. The authors approximate the prediction risk by v-fold cross-validation and by asymptotic estimates, namely generalized cross-validation and H. Akaike's (1970) final prediction error. They apply the technique to the problem of predicting corporate bond ratings. This problem is an attractive case study, since it is characterized by the limited availability of data and by the lack of complete a priori information that could be used to impose a structure on the network architecture.
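As a minimal sketch of the two kinds of prediction-risk estimates named above, the snippet below computes a v-fold cross-validation estimate for a single-hidden-layer perceptron and also gives the standard asymptotic formulas for the final prediction error (FPE) and generalized cross-validation (GCV). The data, the MLPRegressor model, the fold count, and the candidate hidden-layer sizes are illustrative assumptions, not the authors' bond-rating setup.

```python
# Sketch: prediction-risk estimates for architecture selection.
# Assumes scikit-learn's MLPRegressor as a stand-in for the authors' networks.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error


def cv_prediction_risk(X, y, hidden_units, v=5, seed=0):
    """v-fold cross-validation estimate of the prediction risk (mean held-out MSE)."""
    folds = KFold(n_splits=v, shuffle=True, random_state=seed)
    errors = []
    for train_idx, test_idx in folds.split(X):
        net = MLPRegressor(hidden_layer_sizes=(hidden_units,),
                           max_iter=2000, random_state=seed)
        net.fit(X[train_idx], y[train_idx])
        errors.append(mean_squared_error(y[test_idx], net.predict(X[test_idx])))
    return float(np.mean(errors))


def fpe(train_mse, n_obs, n_params):
    """Akaike's final prediction error: training MSE scaled by (N + p) / (N - p)."""
    return train_mse * (n_obs + n_params) / (n_obs - n_params)


def gcv(train_mse, n_obs, n_params):
    """Generalized cross-validation: training MSE divided by (1 - p/N)^2."""
    return train_mse / (1.0 - n_params / n_obs) ** 2


if __name__ == "__main__":
    # Synthetic data purely for illustration: 200 observations, 6 input features.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 6))
    y = np.tanh(X @ rng.normal(size=6)) + 0.1 * rng.normal(size=200)

    # Compare candidate architectures by their estimated prediction risk;
    # the architecture with the lowest estimate would be selected.
    for h in (2, 4, 8):
        risk = cv_prediction_risk(X, y, hidden_units=h)
        print(f"hidden units = {h}: 5-fold CV prediction risk = {risk:.4f}")
```

For the asymptotic estimates, `n_params` would be the number of trainable weights and biases (for one hidden layer of h units with d inputs and one output, p = h(d + 1) + h + 1), and `train_mse` the mean squared error on the training set.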