Regularised Centre Recruitment in Radial Basis Function Networks
and this could be applied either between choosing sets of centres with multiple runs of RFS or between choosing individual centres within one run of RFS. The latter case requires continual adjustment of the matrix G_m to account for the changing value of the regularisation parameter λ. I tried this with the example from figure 1 and, from an initial value of λ = 1, the re-estimations converged to a value close to 0.02 after selecting the first few centres. Perhaps this explains why my original guess of λ = 0.02 worked well to begin with!

It is also possible to attempt to choose the best value of r (the basis function radii in (5)) from the training data. In the case of the above Bayesian scheme it is not possible to derive a re-estimation formula for r, and it is necessary to resort to non-linear optimisation techniques to maximise the evidence. However, it is simpler to use a heuristic to set r (examples are in Moody and Darken (1989)) and then try to optimise λ.

7 Conclusions

Ordinary centre selection, terminated using a fixed threshold on the unexplained variance and without any regularisation, is liable to overfitting even when the true noise variance is known. Cross-validation as a termination criterion reduces the tendency to overfit but has no effect in areas of the input space outside the training set. If, in addition to cross-validation, zero-order regularisation is used in the selection process, then the behaviour of the fitted function outside the training set is more constrained, and this may lead to better extrapolation performance for target functions which are known a priori to be smooth.

Centre selection, cross-validation and regularisation are an advantageous combination in RBF networks. Selection economises on centres, cross-validation avoids gross overfit and regularisation smooths the fit outside the training set. Although regularisation alone may suffice to avoid overfit, parsimony is often an important consideration in practical applications.
This is particularly true in high-dimensional spaces where local methods, like RBF networks, suffer from the "curse of dimensionality".

Figure 3: Average RMS errors for OLS (solid curve) and RFS (dashed curve) using the naive stopping criterion (20) for different values of ^. The set of test points used covered the same regions from which the samples were drawn.

to modify the method of section 4 to be able to backwardly eliminate as well as forwardly select. However, parsimony, and not optimality, is …
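
The re-estimation of the regularisation parameter mentioned above can be illustrated with a small sketch. The formula used in the paper is not reproduced in this excerpt, so the code below assumes the standard evidence fixed-point update for zero-order (ridge) regularisation with a fixed set of centres. The Gaussian basis function, the toy data and the names `design_matrix` and `reestimate_lambda` are illustrative assumptions, not the author's implementation.

```python
import numpy as np


def design_matrix(x, centres, r):
    """Gaussian RBF design matrix for 1-D inputs (assumed basis function form)."""
    diff = x[:, None] - centres[None, :]
    return np.exp(-(diff / r) ** 2)


def reestimate_lambda(H, y, lam=1.0, max_iter=200, rtol=1e-6):
    """Iterate an evidence-style fixed-point update for the ridge parameter.

    Assumed update (standard evidence approximation, not taken from the paper):
        gamma   = sum_i s_i^2 / (s_i^2 + lam)   (effective number of parameters)
        lam_new = gamma * ||y - H w||^2 / ((n - gamma) * ||w||^2)
    where s_i are the singular values of H and w is the ridge solution.
    """
    n, _ = H.shape
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Uty = U.T @ y

    def ridge(lam_value):
        # Ridge weights and effective parameter count for a given lambda.
        w = Vt.T @ (s / (s ** 2 + lam_value) * Uty)
        gamma = np.sum(s ** 2 / (s ** 2 + lam_value))
        return w, gamma

    for _ in range(max_iter):
        w, gamma = ridge(lam)
        resid = y - H @ w
        new_lam = gamma * (resid @ resid) / ((n - gamma) * (w @ w))
        converged = abs(new_lam - lam) <= rtol * lam
        lam = new_lam
        if converged:
            break

    # Report weights and gamma that match the final lambda.
    w, gamma = ridge(lam)
    return lam, w, gamma


# Toy problem (hypothetical data): noisy samples of a smooth 1-D target.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(x.size)
centres = np.linspace(-1.0, 1.0, 10)   # fixed centres, no selection step
H = design_matrix(x, centres, r=0.5)   # r set by hand, not optimised

lam, w, gamma = reestimate_lambda(H, y, lam=1.0)
print("re-estimated lambda:", lam)
print("effective number of parameters:", gamma)
```

In this sketch the centres are fixed in advance. Interleaving the update with forward selection, as described for a single run of RFS, would additionally require refreshing the selection quantities each time the parameter changes, which is not shown here.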