Automatic tagging by exploring tag information capability and correlation

Automatic tagging labels images and videos with semantic tags, which significantly facilitates multimedia search and organization. However, most existing tagging algorithms do not differentiate among the tags used to describe visual content, and they neglect the semantic correlation within the assigned tag set. In this paper, we propose a novel automatic tagging algorithm that tags a test image or video with an Informative and Correlative Tag (ICTag) set. The assigned ICTag set provides a more precise description of the multimedia object by exploiting both the information capability of individual tags and the tag-to-set correlation. We design measures to effectively estimate the information capability of individual tags and the correlation between a tag and the candidate tag set. To reduce computational complexity, we also introduce a heuristic method for efficient automatic tagging. We conduct extensive experiments on the NUS-WIDE web image dataset downloaded from Flickr and the MCG-WEBV web video dataset downloaded from YouTube. The results confirm the efficiency and effectiveness of the proposed algorithm.
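
To make the idea of combining per-tag information capability with tag-to-set correlation more concrete, the following is a minimal illustrative sketch of a greedy selection heuristic. It is not the measures or the heuristic defined in the paper, which the abstract does not specify. The IDF-style information score, the mean Jaccard co-occurrence used as tag-to-set correlation, the weight `alpha`, and all function names are assumptions introduced purely for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def build_stats(tagged_items):
    """Count tag and tag-pair frequencies over a list of tag sets."""
    df, pair = Counter(), Counter()
    for tags in tagged_items:
        tags = set(tags)
        df.update(tags)
        pair.update(frozenset(p) for p in combinations(sorted(tags), 2))
    return df, pair, len(tagged_items)


def info_capability(tag, df, n):
    # Assumption: an IDF-style stand-in, so rarer tags count as more informative.
    return math.log((n + 1) / (df[tag] + 1))


def tag_to_set_correlation(tag, selected, df, pair):
    # Assumption: mean Jaccard co-occurrence between the tag and each selected tag.
    if not selected:
        return 0.0
    total = 0.0
    for s in selected:
        both = pair[frozenset((tag, s))]
        union = df[tag] + df[s] - both
        total += both / union if union else 0.0
    return total / len(selected)


def greedy_tag_set(candidates, df, pair, n, k=3, alpha=0.5):
    """Greedily pick up to k tags by a weighted sum of the two scores."""
    selected, remaining = [], set(candidates)
    while remaining and len(selected) < k:
        best = max(
            sorted(remaining),  # sorted for deterministic tie-breaking
            key=lambda t: alpha * info_capability(t, df, n)
            + (1 - alpha) * tag_to_set_correlation(t, selected, df, pair),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy corpus of user tag sets (hypothetical data, not from NUS-WIDE or MCG-WEBV).
corpus = [
    {"sky", "beach", "sea"},
    {"sky", "sea"},
    {"sky", "city"},
    {"beach", "sea"},
    {"sky"},
]
df, pair, n = build_stats(corpus)
result = greedy_tag_set(["sky", "beach", "sea", "city"], df, pair, n, k=2)
print(result)
```

Adding one tag at a time avoids scoring every possible subset of candidates, which is the kind of cost a heuristic is meant to reduce. In a real system the candidate tags would come from visually similar images or videos, and both scores would be replaced by the measures the paper defines.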
