Actively selecting annotations among objects and attributes

We present an active learning approach to choose image annotation requests among both object category labels and the objects' attribute labels. The goal is to solicit those labels that will best use human effort when training a multi-class object recognition model. In contrast to previous work in active visual category learning, our approach directly exploits the dependencies between human-nameable visual attributes and the objects they describe, shifting its requests in either label space accordingly. We adopt a discriminative latent model that captures object-attribute and attribute-attribute relationships, and then define a suitable entropy reduction selection criterion to predict the influence a new label might have throughout those connections. On three challenging datasets, we demonstrate that the method can more successfully accelerate object learning relative to both passive learning and traditional active learning approaches.

[1]  John Platt,et al.  Probabilistic Outputs for Support vector Machines and Comparisons to Regularized Likelihood Methods , 1999 .

[2]  Andrew McCallum,et al.  Toward Optimal Active Learning through Sampling Estimation of Error Reduction , 2001, ICML.

[3]  Tsuhan Chen,et al.  An active learning framework for content-based information retrieval , 2002, IEEE Trans. Multim..

[4]  Chih-Jen Lin,et al.  Probability Estimates for Multi-class Classification by Pairwise Coupling , 2003, J. Mach. Learn. Res..

[5]  Hema Raghavan,et al.  InterActive Feature Selection , 2005, IJCAI.

[6]  Antonio Torralba,et al.  LabelMe: A Database and Web-Based Tool for Image Annotation , 2008, International Journal of Computer Vision.

[7]  Andrew Zisserman,et al.  Learning Visual Attributes , 2007, NIPS.

[8]  Kristen Grauman,et al.  Multi-Level Active Prediction of Useful Image Annotations for Recognition , 2008, NIPS.

[9]  Xian-Sheng Hua,et al.  Two-Dimensional Active Learning for image classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Larry S. Davis,et al.  Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers , 2008, ECCV.

[11]  David A. Forsyth,et al.  Utility data annotation with Amazon Mechanical Turk , 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[12]  Christoph H. Lampert,et al.  Learning to detect unseen object classes by between-class attribute transfer , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Nikolaos Papanikolopoulos,et al.  Multi-class active learning for image classification , 2009, CVPR.

[14]  Andrew McCallum,et al.  Active Learning by Labeling Features , 2009, EMNLP.

[15]  Kristen Grauman,et al.  What's it going to cost you?: Predicting effort vs. informativeness for multi-label image annotations , 2009, CVPR.

[16]  Thierry Artières,et al.  Large margin training for hidden Markov models with partially observed states , 2009, ICML '09.

[17]  Burr Settles,et al.  Active Learning Literature Survey , 2009 .

[18]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[19]  Ashish Kapoor,et al.  Active learning for large multi-class problems , 2009, CVPR.

[20]  Gang Wang,et al.  Joint learning of visual attributes, object classes and visual saliency , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[21]  Ali Farhadi,et al.  Describing objects by their attributes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Shree K. Nayar,et al.  Attribute and simile classifiers for face verification , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[24]  Greg Mori,et al.  Max-margin hidden conditional random fields for human action recognition , 2009, CVPR.

[25]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Bernt Schiele,et al.  What Helps Where \textendash And Why? Semantic Relatedness for Knowledge Transfer , 2010, CVPR 2010.

[27]  Ali Farhadi,et al.  The benefits and challenges of collecting richer object annotations , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[28]  Bernt Schiele,et al.  What helps where – and why? Semantic relatedness for knowledge transfer , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[29]  Abhinav Gupta,et al.  Beyond active noun tagging: Modeling contextual interactions for multi-class active learning , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[30]  Alexander C. Berg,et al.  Automatic Attribute Discovery and Characterization from Noisy Web Data , 2010, ECCV.

[31]  Pietro Perona,et al.  Visual Recognition with Humans in the Loop , 2010, ECCV.

[32]  Yang Wang,et al.  A Discriminative Latent Model of Object Classes and Attributes , 2010, ECCV.