Practice makes perfect: An adaptive active learning framework for image classification

Active learning is an effective way to iteratively select a subset of images to label from an unlabeled dataset. One of the most widely used active learning strategies is uncertainty sampling. However, traditional sampling strategies do not take the category of the samples into account, and the selected images do not reflect the desired training distribution, so additional labeling work is required. To address these problems from the perspective of visual perception, we improve the traditional entropy-based uncertainty sampling strategy by introducing a certainty measurement estimated with a bag-of-visual-words (BoVW) model. The Rescorla-Wagner perceptive model is used to quantify the stopping criterion. Unlike previous approaches, which treat the sampling and classification processes separately, we treat the learning process as a unified model by proposing a new evolving sample selection method that follows the unified negatively accelerated learning principle and takes the category distribution into consideration. A classifier is trained to provide category distributions for the sampling process, which improves sampling performance and reduces the additional annotation cost for the human annotator. During training, the weights of both modules are adaptively initialized from the internal similarity of the sample set, measured by the structural similarity index (SSIM), and are dynamically adjusted according to the human learning process. In addition to the regular tests commonly used to evaluate traditional sampling methods, a transfer test, based on transfer learning theory, is used to further evaluate the performance of different sampling strategies. Experimental results on real-world datasets show that our active sampling framework outperforms both baseline and state-of-the-art adaptive active learning strategies.
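
As a rough illustration of two ingredients named in the abstract, the sketch below ranks pool items by predictive entropy (the traditional uncertainty sampling baseline) and iterates the standard Rescorla-Wagner update V ← V + rate · (asymptote − V), which yields a negatively accelerated learning curve. It is a minimal sketch, not the paper's method: the function names, the toy probabilities, and the parameter values are all assumptions chosen for illustration, and the BoVW certainty term, SSIM-based weighting, and stopping rule of the proposed framework are not modeled.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return indices of the k pool items with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


def rescorla_wagner_curve(rate, asymptote, trials, v0=0.0):
    """Iterate the Rescorla-Wagner update V <- V + rate * (asymptote - V).

    `rate` stands in for the product of the model's two learning-rate
    parameters; it is a single hypothetical value here.
    """
    v = v0
    curve = []
    for _ in range(trials):
        v += rate * (asymptote - v)
        curve.append(v)
    return curve


# Toy class-probability predictions for three unlabeled images (made up).
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, high entropy
    [0.60, 0.30, 0.10],  # in between
]
picked = select_most_uncertain(pool_probs, k=2)

# Associative strength over five trials, and the per-trial gains.
curve = rescorla_wagner_curve(rate=0.3, asymptote=1.0, trials=5)
gains = [b - a for a, b in zip([0.0] + curve, curve)]

print("queried indices:", picked)
print("learning curve:", [round(v, 4) for v in curve])
print("per-trial gains:", [round(g, 4) for g in gains])
```

In this toy run the per-trial gains shrink on every step, which is the negatively accelerated shape the abstract refers to; a stopping criterion could, for example, halt sampling once the gain falls below a chosen threshold, though the abstract does not specify the rule the authors actually use.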
