Network Committees and Weighting Schemes

It is well known that combining many different predictors can improve predictions. The combination is usually effected by majority vote in classification or by simple averaging in regression, but one can also use a weighted combination of the networks. The first section of this chapter summarises the main ideas of a recent study by Krogh and Vedelsby on network committees for simple interpolation tasks. The generalisation performance of the committee is seen to depend on the variation in the outputs of the ensemble members, which the authors refer to as the ambiguity. The second section generalises these results to the prediction of conditional probability densities. A similar dependence on an ambiguity term is found, which suggests that diversity in the committee is a crucial requirement for improving the generalisation performance. The third section discusses aspects of the weighting scheme. Optimising the weights on the basis of the training data is shown to lead to sub-optimal behaviour, and the evidence scheme is suggested as an approach that arises naturally within a Bayesian framework. However, numerical deficiencies of the latter are pointed out, and the chapter concludes with the introduction of an alternative scheme.
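
As a rough illustration of the ambiguity idea from the first section, the sketch below checks numerically that, for a weighted committee with weights summing to one and a squared-error loss, the committee error at a single input equals the weighted mean member error minus the ambiguity. The function name `committee_decomposition`, the toy member outputs and the uniform weights are assumptions made for illustration only; they are not taken from the study by Krogh and Vedelsby or from this chapter.

```python
import random


def committee_decomposition(member_outputs, weights, target):
    """Return (committee_error, mean_member_error, ambiguity) at one input.

    Assumes squared-error loss and weights that sum to one.
    """
    # Weighted committee prediction.
    committee = sum(w * f for w, f in zip(weights, member_outputs))
    # Squared error of the committee itself.
    committee_error = (committee - target) ** 2
    # Weighted average of the individual members' squared errors.
    mean_member_error = sum(
        w * (f - target) ** 2 for w, f in zip(weights, member_outputs)
    )
    # Ambiguity: weighted spread of the members around the committee output.
    ambiguity = sum(
        w * (f - committee) ** 2 for w, f in zip(weights, member_outputs)
    )
    return committee_error, mean_member_error, ambiguity


# Hypothetical toy data: five "networks" scattered around the true target.
rng = random.Random(0)
target = 1.0
member_outputs = [target + rng.gauss(0.0, 0.5) for _ in range(5)]
weights = [1.0 / 5] * 5  # simple averaging

e, e_bar, a_bar = committee_decomposition(member_outputs, weights, target)
print("committee error:       ", e)
print("mean member error:     ", e_bar)
print("ambiguity:             ", a_bar)
print("mean error - ambiguity:", e_bar - a_bar)
```

Because the ambiguity is non-negative, the committee error in this setting can never exceed the weighted mean error of its members, and the gap grows with the disagreement between them. This is the sense in which diversity helps.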