The Impact of Typicality for Informative Representative Selection

In computer vision, selecting the most informative samples from a large pool of training data in order to learn a good recognition model is an active research problem. Selecting such samples also reduces annotation cost, since labeling unlabeled samples is time-consuming. In this paper, motivated by theories in data compression, we propose a novel sample selection strategy that exploits the concept of typicality from information theory. Typicality is a simple and powerful technique that can be applied to compress the training data needed to learn a good classification model. In this work, typicality is used to identify a subset of the most informative samples for labeling, which are then used to update the model through active learning. The proposed model can take advantage of the inter-relationships between data samples. Our approach leads to a significant reduction in manual labeling cost while achieving similar or better recognition performance compared to a model trained on the entire training set. This is demonstrated through experiments on five datasets.
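For readers unfamiliar with the term, the following minimal sketch illustrates the standard information-theoretic notion of weak typicality: a sequence drawn i.i.d. from a distribution p is typical if its per-symbol negative log-probability is within a tolerance of the entropy H(p). This is only an illustration of the general concept, not the selection criterion used in the paper, which the abstract does not specify. The function names, the dictionary representation of p, and the tolerance `eps` are assumptions made for this example.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution {symbol: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def is_weakly_typical(seq, p, eps):
    """Return True if seq is weakly eps-typical with respect to p.

    Weak typicality (assuming i.i.d. symbols):
        | -(1/n) * log2 p(seq) - H(p) | <= eps
    """
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must be non-empty")
    # A sequence containing a zero-probability symbol can never be typical.
    if any(p.get(s, 0.0) == 0.0 for s in seq):
        return False
    sample_entropy = -sum(math.log2(p[s]) for s in seq) / n
    return abs(sample_entropy - entropy(p)) <= eps


# Hypothetical toy distribution over three symbols.
p = {"a": 0.5, "b": 0.25, "c": 0.25}
print(is_weakly_typical("aabc", p, eps=0.01))  # symbol frequencies match p
print(is_weakly_typical("aaaa", p, eps=0.1))   # over-represents the likeliest symbol
```

In this toy case the sequence whose symbol frequencies match p passes the check, while the all-"a" sequence, although individually more probable, does not. This is the sense in which a typical set captures representative rather than merely high-probability samples.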
