Semantic-Gap-Oriented Active Learning for Multilabel Image Annotation

User interaction is an effective way to handle the semantic gap problem in image annotation. To minimize the user effort required in these interactions, many active learning methods have been proposed. These methods treat the semantic concepts either individually or correlatively. However, they neglect the key motivation for user feedback: to tackle the semantic gap. The size of each concept's semantic gap is an important factor in how much user feedback improves performance. Users should devote more effort to concepts with large semantic gaps and less effort to concepts with small ones. In this paper, we propose a semantic-gap-oriented active learning method, which incorporates a semantic gap measure into the information-minimization-based sample selection strategy. The basic learning model used in the active learning framework is a multilabel extension of the sparse-graph-based semisupervised learning method that incorporates semantic correlation. Extensive experiments conducted on two benchmark image data sets demonstrate the importance of bringing the semantic gap measure into the active learning process.
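The selection idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the exact selection criterion or the semantic gap measure, so this example assumes binary entropy of the predicted label probability as a stand-in for informativeness and takes the per-concept gap values as given inputs. The names `binary_entropy`, `select_pairs`, `probs`, `gaps`, and `labeled_mask` are hypothetical.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Entropy (in bits) of a Bernoulli variable with success probability p."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))


def select_pairs(probs, gaps, labeled_mask, batch_size):
    """Pick the (sample, concept) pairs to send to the user for labeling.

    probs        : (n_samples, n_concepts) predicted label probabilities.
    gaps         : (n_concepts,) assumed semantic gap measure per concept;
                   larger values mean a larger gap.
    labeled_mask : (n_samples, n_concepts) bool, True where already labeled.
    batch_size   : maximum number of pairs to return.
    """
    # Assumption: informativeness = prediction entropy, scaled by the gap,
    # so uncertain predictions on large-gap concepts are queried first.
    scores = binary_entropy(probs) * gaps[np.newaxis, :]
    # Already-labeled pairs must never be selected again.
    scores = np.where(labeled_mask, -np.inf, scores)
    # Sort all pairs by descending score and keep the top batch_size.
    order = np.argsort(scores, axis=None)[::-1][:batch_size]
    rows, cols = np.unravel_index(order, scores.shape)
    return [(int(r), int(c)) for r, c in zip(rows, cols)
            if np.isfinite(scores[r, c])]


if __name__ == "__main__":
    probs = np.array([[0.5, 0.5], [0.9, 0.6]])
    gaps = np.array([0.2, 1.0])  # concept 1 has the larger assumed gap
    labeled = np.zeros_like(probs, dtype=bool)
    print(select_pairs(probs, gaps, labeled, batch_size=2))
```

In this toy input both selected pairs belong to the second concept, because its larger assumed gap outweighs the equally uncertain prediction on the first concept. A faithful implementation would replace the entropy score with the paper's information-minimization criterion and obtain the predictions from the sparse-graph-based multilabel model.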
