Neural Network Ensembles, Cross Validation, and Active Learning

Learning continuous-valued functions with neural network ensembles (committees) can improve accuracy, give a reliable estimate of the generalization error, and support active learning. The ambiguity is defined as the variation of the ensemble members' outputs, averaged over unlabeled data, so it quantifies the disagreement among the networks. Combined with cross-validation, the ambiguity gives a reliable estimate of the ensemble generalization error, and this type of ensemble cross-validation can sometimes improve performance. It is shown how to estimate the optimal weights of the ensemble members using unlabeled data. Finally, by a generalization of query by committee, it is shown how the ambiguity can be used to select new training data to be labeled in an active learning scheme.
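To make the ambiguity concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes squared error and non-negative member weights that sum to one. Under those assumptions, a standard identity for weighted averages says the ensemble's squared error equals the weighted average of the members' squared errors minus the ambiguity. That is why a label-free disagreement measure can help estimate the ensemble error. The function names and the toy outputs are hypothetical, chosen only for illustration.

```python
def ensemble_output(outputs, weights):
    """Weighted ensemble prediction for a single input."""
    return sum(w * f for w, f in zip(weights, outputs))


def ambiguity(outputs, weights):
    """Weighted spread of the member outputs around the ensemble output.

    No target value is needed, so this can be computed on unlabeled inputs.
    """
    f_bar = ensemble_output(outputs, weights)
    return sum(w * (f - f_bar) ** 2 for w, f in zip(weights, outputs))


def mean_ambiguity(all_outputs, weights):
    """Ambiguity averaged over a set of (unlabeled) inputs."""
    return sum(ambiguity(o, weights) for o in all_outputs) / len(all_outputs)


def most_ambiguous(all_outputs, weights):
    """Index of the input on which the members disagree most.

    A simple query rule in the spirit of query by committee (an assumption
    of this sketch): ask for the label where the ambiguity is largest.
    """
    return max(range(len(all_outputs)),
               key=lambda i: ambiguity(all_outputs[i], weights))


def weighted_member_error(outputs, weights, target):
    """Weighted average of the members' squared errors on one labeled input."""
    return sum(w * (f - target) ** 2 for w, f in zip(weights, outputs))


# Hypothetical outputs of three networks on four unlabeled inputs.
member_outputs = [
    [1.0, 1.0, 1.0],   # full agreement, ambiguity 0
    [0.0, 1.0, 2.0],   # strong disagreement
    [0.5, 0.5, 0.8],
    [2.0, 2.0, 2.5],
]
weights = [1 / 3, 1 / 3, 1 / 3]  # uniform weights, assumed to sum to 1

query_index = most_ambiguous(member_outputs, weights)
```

Here `query_index` points at the second input, where the three hypothetical networks disagree most, so that input would be the next one sent for labeling under this rule.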
