Stacking Bagged and Dagged Models

In this paper, we investigate the method of stacked generalization in combining models derived from different subsets of a training dataset by a single learning algorithm, as well as different algorithms. The simplest way to combine predictions from competing models is majority vote, and the effect of the sampling regime used to generate training subsets has already been studied in this context: when bootstrap samples are used the method is called bagging, and for disjoint samples we call it dagging. This paper extends these studies to stacked generalization, where a learning algorithm is employed to combine the models. This yields new methods dubbed bag-stacking and dag-stacking. We demonstrate that bag-stacking and dag-stacking can be effective for classification tasks even when the training samples cover just a small fraction of the full dataset. In contrast to earlier bagging results, we show that bagging and bag-stacking work for stable as well as unstable learning algorithms, as do dagging and dag-stacking. We find that bag-stacking (dag-stacking) almost always has higher predictive accuracy than bagging (dagging), and we also show that bag-stacking models derived using two different algorithms is more effective than bagging.
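To make the distinctions concrete, the sketch below contrasts the sampling regimes (bootstrap vs. disjoint subsets) and the two combination schemes (majority vote vs. a level-1 learner). It is a minimal illustration using scikit-learn-style estimators; the choice of decision trees as level-0 models and logistic regression as the level-1 learner is an assumption for this example, not the paper's experimental setup, and for brevity the level-1 data is built from predictions on the full training set rather than the held-out (e.g. out-of-bag) predictions that stacked generalization properly calls for.

```python
# Illustrative sketch only: bagging/dagging by majority vote vs. bag-/dag-stacking
# with a level-1 learner. Base and meta learners here are assumptions, not the
# paper's exact configuration.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression


def make_subsets(n_samples, n_models, scheme, rng):
    """Index sets for the level-0 models: bootstrap samples ("bag") or
    disjoint samples ("dag")."""
    if scheme == "bag":
        return [rng.choice(n_samples, size=n_samples, replace=True)
                for _ in range(n_models)]
    perm = rng.permutation(n_samples)
    return np.array_split(perm, n_models)


def fit_level0(X, y, n_models, scheme, seed=0):
    """Train one level-0 model per training subset."""
    rng = np.random.default_rng(seed)
    return [DecisionTreeClassifier(random_state=seed).fit(X[idx], y[idx])
            for idx in make_subsets(len(y), n_models, scheme, rng)]


def predict_vote(models, X):
    """Plain bagging/dagging: combine level-0 predictions by majority vote.
    Assumes integer class labels."""
    votes = np.stack([m.predict(X) for m in models])   # (n_models, n_samples)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)


def fit_stacker(models, X, y):
    """Bag-stacking / dag-stacking: a level-1 learner is trained on the
    level-0 models' class probabilities instead of taking a vote.
    (Simplification: uses full-training-set predictions; a faithful
    implementation would use out-of-bag or held-out predictions.
    Assumes every subset contains all classes.)"""
    meta_features = np.hstack([m.predict_proba(X) for m in models])
    return LogisticRegression(max_iter=1000).fit(meta_features, y)


def predict_stacked(models, stacker, X):
    """Predict with the stacked ensemble: level-0 probabilities -> level-1 learner."""
    meta_features = np.hstack([m.predict_proba(X) for m in models])
    return stacker.predict(meta_features)
```

Using class probabilities rather than predicted labels as level-1 features is one common choice for the stacking step; the paper's point is simply that learning the combination (the `fit_stacker` path) tends to outperform the fixed majority vote (the `predict_vote` path) under both sampling regimes.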
