For any real-world generalization problem, there are always many generalizers which could be applied to the problem. This chapter discusses some algorithmic techniques for dealing with this multiplicity of possible generalizers. All of these techniques rely on partitioning the provided learning set into two parts, many different times. The first technique discussed is cross-validation, which is a winner-takes-all strategy: based on the behavior of the generalizers on the partitions of the learning set, it picks one single generalizer from amongst the set of candidates, and tells you to use that generalizer. The second technique discussed, the one this chapter concentrates on, is an extension of cross-validation called stacked generalization. As opposed to cross-validation's winner-takes-all strategy, stacked generalization uses the partitions of the learning set to combine the generalizers, in a non-linear manner, via another generalizer (hence the term "stacked generalization"). This chapter ends by discussing some possible extensions of stacked generalization.
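The contrast between the two techniques can be sketched in code. The following is a minimal illustration, not the chapter's own algorithm: it assumes a toy one-dimensional regression learning set, two hypothetical level-0 generalizers (a least-squares linear fit and a 1-nearest-neighbour rule), and a linear least-squares fit standing in for the level-1 generalizer. Both cross-validation's winner-takes-all choice and the stacked combination are built from the same out-of-fold predictions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy learning set (hypothetical data, for illustration only): y ~ 2x + noise.
X = rng.uniform(-1.0, 1.0, size=(60, 1))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=60)

def fit_linear(Xtr, ytr):
    # Least-squares linear generalizer; returns a prediction function.
    A = np.hstack([Xtr, np.ones((len(Xtr), 1))])
    w, *_ = np.linalg.lstsq(A, ytr, rcond=None)
    return lambda Xq: np.hstack([Xq, np.ones((len(Xq), 1))]) @ w

def fit_nn(Xtr, ytr):
    # 1-nearest-neighbour generalizer (1-D inputs).
    def predict(Xq):
        idx = np.argmin(np.abs(Xq - Xtr.T), axis=1)
        return ytr[idx]
    return predict

level0 = [fit_linear, fit_nn]

# Partition the learning set many times (k-fold splits into two parts).
k = 5
folds = np.array_split(rng.permutation(len(X)), k)

# For each point, record the predictions each level-0 generalizer makes
# when trained on the part of the learning set NOT containing that point.
Z = np.zeros((len(X), len(level0)))
for fold in folds:
    train = np.setdiff1d(np.arange(len(X)), fold)
    for j, fit in enumerate(level0):
        Z[fold, j] = fit(X[train], y[train])(X[fold])

# Cross-validation: winner-takes-all -- keep the single generalizer with
# the lowest out-of-fold error and discard the rest.
cv_errors = [np.mean((Z[:, j] - y) ** 2) for j in range(len(level0))]
winner = int(np.argmin(cv_errors))

# Stacked generalization: treat (Z, y) as a new learning set and train a
# level-1 generalizer to combine the level-0 outputs.
stack = fit_linear(Z, y)

# For deployment, refit the level-0 generalizers on the full learning set
# and feed their outputs through the level-1 generalizer.
full = [fit(X, y) for fit in level0]
def stacked_predict(Xq):
    return stack(np.column_stack([f(Xq) for f in full]))
```

Note that the level-1 learning set `Z` is built only from out-of-fold predictions; using in-sample predictions instead would let the level-1 generalizer reward level-0 generalizers that merely memorize the learning set.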