Active Learning with Misclassification Sampling Using Diverse Ensembles Enhanced by Unlabeled Instances

Active learners can significantly reduce the number of labeled training instances needed to learn a classification function by selecting only the most informative instances for labeling. Most existing methods try to select instances that halve the size of the version space after each sampling step. In contrast, we aim to reduce the volume of the version space by more than half. To this end, we present a sampling criterion based on misclassification. In each iteration of active learning, a strong classifier is introduced to estimate the target function, and this estimate is used to evaluate the misclassification degree of an instance. As the strong classifier, we use a modified version of the popular ensemble learning method DECORATE, enhanced with unlabeled instances that the current base classifier labels with high certainty. Experiments show that the proposed method outperforms traditional sampling methods on most of the selected datasets.
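The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the method as specified in the paper: the abstract does not define the misclassification degree, so the sketch assumes it is the probability mass that the strong classifier assigns to classes other than the base classifier's prediction. The function names, the `threshold` parameter, and the representation of classifiers as callables returning class-probability dictionaries are all hypothetical, and the DECORATE ensemble itself is not implemented here.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Instance = Sequence[float]
Proba = Dict[Hashable, float]  # class label -> estimated probability
ProbaFn = Callable[[Instance], Proba]


def confident_pseudo_labels(
    pool: List[Instance], base_proba: ProbaFn, threshold: float = 0.9
) -> List[Tuple[Instance, Hashable]]:
    """Collect unlabeled instances the base classifier labels with high certainty.

    Assumption: "high certainty" means the top class probability reaches
    `threshold`. These pairs could then be added to the training set of the
    strong classifier.
    """
    pairs = []
    for x in pool:
        proba = base_proba(x)
        label = max(proba, key=proba.get)
        if proba[label] >= threshold:
            pairs.append((x, label))
    return pairs


def misclassification_degree(base_label: Hashable, strong_proba: Proba) -> float:
    """Assumed score: probability, per the strong classifier, that base_label is wrong."""
    return 1.0 - strong_proba.get(base_label, 0.0)


def select_query(pool: List[Instance], base_proba: ProbaFn, strong_proba: ProbaFn) -> int:
    """Return the index of the pool instance whose base prediction looks most likely wrong."""

    def score(i: int) -> float:
        p = base_proba(pool[i])
        base_label = max(p, key=p.get)
        return misclassification_degree(base_label, strong_proba(pool[i]))

    return max(range(len(pool)), key=score)
```

In a full loop, one would presumably retrain the base classifier on the labeled set, rebuild the strong classifier from the labeled set plus the confident pseudo-labels, call `select_query` to pick the next instance for the oracle, and repeat until the labeling budget is spent.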