On Deriving the Second-Stage Training Set for Trainable Combiners

Unlike fixed combining rules, a trainable combiner is applicable to ensembles of diverse base-classifier architectures with incomparable outputs. The trainable combiner, however, requires the additional step of deriving a second-stage training set from the base-classifier outputs. Although several strategies have been devised, it remains unclear which is superior in a given situation. In this paper we investigate three principal training techniques: re-using the training set for both stages, holding out an independent validation set, and stacked generalization. In experiments on several datasets we observed that stacked generalization outperforms the other techniques in most situations, except at very small sample sizes, where the re-use strategy performs better. We show that stacked generalization introduces additional noise into the second-stage training set and should therefore be paired with simple combiners that are insensitive to this noise. We propose an extension of stacked generalization that significantly improves the robustness of the combiner.
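To make the three strategies concrete, the sketch below shows how each one derives the second-stage (combiner) training set from base-classifier outputs. This is a minimal illustration under assumed conditions: the scikit-learn estimators, function names, and parameters are our own illustrative choices, not the implementation evaluated in the paper.

```python
# Three ways to build the second-stage training set (Z, y) for a trainable
# combiner. All names and estimator choices here are illustrative assumptions.
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

def reuse_meta_features(bases, X, y):
    """Re-use: train each base classifier on the full training set and score
    that same set. Cheap, but the meta-features are optimistically biased."""
    Z = np.hstack([clone(b).fit(X, y).predict_proba(X) for b in bases])
    return Z, y

def validation_meta_features(bases, X, y, frac=0.5, seed=0):
    """Independent validation set: bases see only X_tr; the combiner is
    trained on their outputs for the held-out X_val. Unbiased, but each
    stage sees only part of the data."""
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=frac, random_state=seed, stratify=y)
    Z = np.hstack([clone(b).fit(X_tr, y_tr).predict_proba(X_val)
                   for b in bases])
    return Z, y_val

def stacked_meta_features(bases, X, y, k=5):
    """Stacked generalization (Wolpert, 1992): k-fold cross-validation yields
    out-of-fold outputs for every training sample, so the full set reaches
    the combiner without direct re-use bias, at extra training cost."""
    Z = np.hstack([cross_val_predict(clone(b), X, y, cv=k,
                                     method="predict_proba")
                   for b in bases])
    return Z, y

if __name__ == "__main__":
    X, y = make_classification(n_samples=400, n_features=20, random_state=0)
    bases = [LogisticRegression(max_iter=1000),
             DecisionTreeClassifier(random_state=0),
             GaussianNB()]
    for name, fn in [("re-use", reuse_meta_features),
                     ("validation", validation_meta_features),
                     ("stacking", stacked_meta_features)]:
        Z, yz = fn(bases, X, y)
        print(f"{name}: second-stage set {Z.shape}, labels {yz.shape}")
```

Fitting a simple combiner, for example a linear model, on the resulting (Z, y) pairs completes the two-stage system; per the abstract's observation, simpler combiners are the safer choice on the noisier stacked features.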
