Integrating multiple information of active learning for image classification

In image classification, active learning can substantially reduce labeling effort by selecting the most informative instances for user annotation while still yielding a satisfactory classifier. Traditional active learning methods ignore the cost of manual labeling, treating it as identical for every instance, and focus on minimizing classification error to improve classifier performance. In practice, however, the annotation cost varies across instances and changes dynamically. We adopt the value of information framework to measure instance informativeness, which accounts for both misclassification risk and the cost of user annotation. Because the value of information is computed from the probabilities of the current classifier, which is trained only on the labeled examples, it may query outliers. To also leverage the distribution of the large pool of remaining unlabeled instances, we use information density to measure the representativeness of each sample. To this end, we propose IMIM, an active learning method for image classification that integrates multiple sources of information, combining the strengths of the value of information and information density criteria through a heuristic weighting strategy. Finally, the most informative instance is selected using the expected error reduction method. Experimental results on diverse datasets demonstrate the effectiveness of the proposed method compared with state-of-the-art methods.
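
The idea of weighting a value-of-information score by information density can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: density is taken as the mean cosine similarity to the other unlabeled instances, the `voi` values are placeholder numbers rather than computed risk and cost terms, and the multiplicative rule with exponent `beta` is a hypothetical stand-in for the paper's heuristic weighting strategy. The expected error reduction step is not shown.

```python
import numpy as np


def information_density(X):
    """Mean cosine similarity of each row of X to all other rows.

    Assumed definition for illustration; the paper's measure may differ.
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.clip(norms, 1e-12, None)  # guard against zero vectors
    sim = unit @ unit.T                     # pairwise cosine similarities
    np.fill_diagonal(sim, 0.0)              # exclude self-similarity
    return sim.sum(axis=1) / max(len(X) - 1, 1)


def combined_score(voi, density, beta=1.0):
    """Hypothetical weighting: VOI scaled by (non-negative) density**beta."""
    return voi * np.power(np.clip(density, 0.0, None), beta)


# Toy unlabeled pool: three similar points and one outlier (last row).
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [-1.0, 0.0]])

# Placeholder VOI values; the outlier looks most "valuable" on VOI alone.
voi = np.array([0.5, 0.5, 0.5, 0.9])

density = information_density(X)
scores = combined_score(voi, density)
best = int(np.argmax(scores))  # index of the instance to query
```

In this toy pool the outlier has the highest placeholder VOI, but its low density suppresses its combined score, so one of the representative points is queried instead.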
