An Evaluation of Grading Classifiers

In this paper, we discuss grading, a meta-classification technique that tries to identify and correct incorrect predictions at the base level. While stacking uses the predictions of the base classifiers as meta-level attributes, we use "graded" predictions (i.e., predictions that have been marked as correct or incorrect) as meta-level classes. For each base classifier, one meta classifier is learned whose task is to predict when the base classifier will err. Hence, just like stacking may be viewed as a generalization of voting, grading may be viewed as a generalization of selection by cross-validation and therefore fills a conceptual gap in the space of meta-classification schemes. Our experimental evaluation shows that this technique results in a performance gain that is quite comparable to that achieved by stacking, while both, grading and stacking outperform their simpler counter-parts voting and selection by cross-validation.