Neural Model Selection: How to Determine the Fittest Criterion?

Based on recent results on least-squares estimation for non-linear time series, M. Mangeas and J.F. Yao [6] proposed an identification criterion for neural architectures. For a given series of T observations, we know that for any γ ∈ R+*, the neural model (architecture + weights) selected by minimizing the least-squares criterion LSC = MSE + γ (ln T / T) · n (where n denotes the number of weights) converges almost surely towards the “true” model as T grows to infinity. Nevertheless, when few observations are available, an identification method based on this criterion (such as the pruning method named Statistical Stepwise Method (SSM) [1]) can yield different neural models. In this paper, we propose a heuristic for setting the value of γ with respect to the series we deal with (its complexity and the fixed number T of observations). The basic idea is to split the set of observations into two subsets, following the well-known cross-validation method, and to apply the SSM methodology (using the LSC criterion) on the first subset (the learning set) for different values of γ. Once the best value of γ is found (the one minimizing the MSE on the second subset, the validation set), we can use the identification scheme on the whole data set.
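The following is a minimal sketch of this γ-selection heuristic, not the authors' implementation. Since the neural models and the SSM pruning procedure [1] are not reproduced here, a linear autoregressive model fitted by ordinary least squares and a greedy backward elimination of lags stand in for them; the candidate γ grid, the 70/30 split, the toy series, and the helper names (fit_ar, identify, validation_mse) are assumptions chosen for illustration only.

```python
import numpy as np

def lsc(mse, n_params, T, gamma):
    """Criterion from the text: LSC = MSE + gamma * (ln T / T) * n."""
    return mse + gamma * (np.log(T) / T) * n_params

def fit_ar(y, lags):
    """OLS fit of a linear AR model on the given lags (stand-in for training a neural model)."""
    p = max(lags)
    X = np.column_stack([y[p - l: len(y) - l] for l in lags])
    X = np.column_stack([np.ones(len(X)), X])          # intercept
    target = y[p:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    mse = np.mean((target - X @ coef) ** 2)
    return coef, mse

def identify(y, gamma, max_lag=6):
    """Greedy backward elimination of lags minimizing LSC -- a crude stand-in for SSM pruning."""
    T = len(y)
    lags = list(range(1, max_lag + 1))
    _, mse = fit_ar(y, lags)
    best = lsc(mse, len(lags) + 1, T, gamma)
    improved = True
    while improved and len(lags) > 1:
        improved = False
        for l in list(lags):
            trial = [k for k in lags if k != l]
            _, mse = fit_ar(y, trial)
            crit = lsc(mse, len(trial) + 1, T, gamma)
            if crit < best:                              # drop the lag if the penalized criterion improves
                best, lags, improved = crit, trial, True
                break
    return lags

def validation_mse(y_learn, y_valid, lags):
    """Refit on the learning set, score one-step-ahead MSE on the validation set."""
    coef, _ = fit_ar(y_learn, lags)
    p = max(lags)
    y_full = np.concatenate([y_learn[-p:], y_valid])
    X = np.column_stack([y_full[p - l: len(y_full) - l] for l in lags])
    X = np.column_stack([np.ones(len(X)), X])
    return np.mean((y_full[p:] - X @ coef) ** 2)

# Heuristic from the text: choose gamma by cross-validation, then identify on the whole data set.
rng = np.random.default_rng(0)
y = np.zeros(300)
for t in range(2, 300):                                  # toy non-linear AR(2) series (assumed)
    y[t] = 0.5 * y[t - 1] - 0.3 * np.tanh(y[t - 2]) + 0.1 * rng.standard_normal()

split = int(0.7 * len(y))
y_learn, y_valid = y[:split], y[split:]

gammas = [0.5, 1.0, 2.0, 4.0]                            # candidate penalty weights (assumed grid)
scores = {g: validation_mse(y_learn, y_valid, identify(y_learn, g)) for g in gammas}
gamma_star = min(scores, key=scores.get)                 # gamma with the smallest validation MSE

final_lags = identify(y, gamma_star)                     # identification scheme on all observations
print(f"selected gamma = {gamma_star}, retained lags = {final_lags}")
```

The outer loop is the point of the sketch: it stays unchanged if the inner identify routine is replaced by the actual SSM pruning of a feed-forward network, since only the fitting and pruning step depends on the model class while the γ-selection by validation MSE does not.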