Image classification using a set of labeled and unlabeled images

Classifying images into meaningful classes such as indoor, outdoor, landscape, urban, and faces is essentially a supervised pattern recognition problem. Training such a classifier requires a large set of labeled examples, so any strategy that reduces the labeling burden is important for deploying these classifiers in practical applications. In this paper we show that the labeled training set can be augmented with a set of unlabeled examples to boost classifier performance. In general, unlabeled examples are not guaranteed to improve performance. We show, however, that if the examples to be labeled are selected automatically through an unsupervised clustering step, performance is more likely to improve when the unlabeled set is added. We first present a modified EM algorithm that combines labeled and unlabeled sets for training, and then apply it to image classification. Using mutually exclusive classes, we show that the clustering step is crucial to the improvement in classifier performance.
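To make the idea of training on labeled and unlabeled sets together concrete, the following is a minimal sketch of a generic semi-supervised EM procedure. It is not the modified EM algorithm of this paper, whose details are not given here. The sketch assumes one-dimensional features, one Gaussian per class, and at least one labeled example per class; the function names (`semi_supervised_em`, `predict`, and the helpers) are hypothetical and chosen only for illustration. Labeled examples keep fixed one-hot class memberships, while the memberships of unlabeled examples are re-estimated in each E-step.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def m_step(xs, resp, n_classes, min_var):
    """Re-estimate class priors, means and variances from soft memberships."""
    weights = [sum(r[k] for r in resp) for k in range(n_classes)]
    total = sum(weights)
    priors, means, variances = [], [], []
    for k in range(n_classes):
        mean = sum(r[k] * x for r, x in zip(resp, xs)) / weights[k]
        var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, xs)) / weights[k]
        priors.append(weights[k] / total)
        means.append(mean)
        variances.append(max(var, min_var))  # floor avoids a zero variance
    return priors, means, variances


def posteriors(x, params):
    """Class posterior probabilities for a single feature value x."""
    priors, means, variances = params
    joint = [p * gaussian_pdf(x, m, v) for p, m, v in zip(priors, means, variances)]
    total = sum(joint)
    if total == 0.0:  # numerical underflow: fall back to a uniform posterior
        return [1.0 / len(joint)] * len(joint)
    return [j / total for j in joint]


def semi_supervised_em(labeled, unlabeled, n_classes, n_iter=50, min_var=1e-6):
    """labeled: list of (x, class_index) pairs; unlabeled: list of x values.

    Assumption: every class has at least one labeled example.
    Returns (priors, means, variances).
    """
    xs = [x for x, _ in labeled] + list(unlabeled)
    # Labeled examples: fixed one-hot memberships that are never updated.
    resp = [[1.0 if k == y else 0.0 for k in range(n_classes)] for _, y in labeled]
    # Unlabeled examples: zero weight until the first E-step.
    resp += [[0.0] * n_classes for _ in unlabeled]
    params = m_step(xs, resp, n_classes, min_var)  # initialise from labels only
    for _ in range(n_iter):
        # E-step: update memberships of the unlabeled examples only.
        for i in range(len(labeled), len(xs)):
            resp[i] = posteriors(xs[i], params)
        # M-step: refit the model on labeled and unlabeled examples together.
        params = m_step(xs, resp, n_classes, min_var)
    return params


def predict(x, params):
    """Index of the most probable class for feature value x."""
    post = posteriors(x, params)
    return post.index(max(post))
```

A call such as `semi_supervised_em([(0.0, 0), (0.5, 0), (5.0, 1), (5.5, 1)], [0.1, 0.3, 4.8, 5.3], n_classes=2)` returns the fitted parameters, which can then be passed to `predict`. Which examples end up in the labeled list is left open in this sketch; in the approach described above, that choice would be made by the unsupervised clustering step.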