Towards optimizing human labeling for interactive image tagging

Interactive tagging combines human and computer effort to assign descriptive keywords to image content in a semi-automatic way. It avoids the main problems of both fully automatic tagging and purely manual tagging by striking a compromise between tagging performance and manual cost. However, existing research on interactive tagging focuses mainly on sample selection and on models for tag prediction. In this work, we investigate interactive tagging from a different angle: we introduce an interactive image tagging framework that makes fuller use of human labeling effort. That is, it can reach a specified tagging performance with less manual labeling effort, or achieve better tagging performance at a specified labeling cost. In the framework, hashing enables fast clustering of image regions, and a dynamic multiscale clustering labeling strategy is proposed so that users can label a large group of similar regions at a time. We also employ a tag refinement method so that inappropriate tags can be corrected automatically. Experiments on a large dataset demonstrate the effectiveness of our approach.
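
The idea of hashing regions into clusters and labeling a whole cluster at once can be illustrated with a minimal sketch. The abstract does not specify the hashing scheme, so random-hyperplane hashing is assumed here purely for illustration, and all function names (`make_hyperplanes`, `cluster_by_hash`, `propagate_label`) are hypothetical. Using a shorter prefix of the hash code gives coarser clusters, which is one simple way to obtain several clustering scales; it is not necessarily the strategy used in the paper.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed hashing scheme, not from the paper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vec, planes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        int(sum(p * x for p, x in zip(plane, vec)) >= 0.0) for plane in planes
    )


def cluster_by_hash(features, planes, n_bits):
    """Group region indices by the first n_bits of their hash code.

    Fewer bits give fewer, larger clusters (a coarser scale).
    """
    buckets = defaultdict(list)
    for idx, vec in enumerate(features):
        buckets[hash_code(vec, planes[:n_bits])].append(idx)
    return dict(buckets)


def propagate_label(labels, cluster, tag):
    """Assign one user-provided tag to every region in a cluster."""
    for idx in cluster:
        labels[idx] = tag


if __name__ == "__main__":
    # Toy 2-D region features; real region descriptors would be much higher-dimensional.
    features = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, 1.0]]
    planes = make_hyperplanes(dim=2, n_bits=4, seed=1)
    coarse = cluster_by_hash(features, planes, n_bits=2)
    fine = cluster_by_hash(features, planes, n_bits=4)
    print("coarse clusters:", coarse)
    print("fine clusters:", fine)
```

Because the fine code extends the coarse code, every fine cluster is contained in exactly one coarse cluster. A user could therefore tag a coarse cluster in a single action and drop to a finer scale only when the coarse cluster mixes several concepts.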
