Tag suggestion and localization for web videos by bipartite graph matching

In this paper, we formulate video tagging as a bipartite graph matching problem. Starting from the existing tags originally provided by video owners, we conduct keyword-based image search on Flickr. Tags associated with the retrieved images are collected as candidate tags for suggestion. The relationships between the keyframes extracted from a video shot and the candidate tags are then described as a bipartite graph, and the best matching between these two disjoint sets determines which new tags are suggested for that shot. In constructing the bipartite graph, visual characteristics, represented with the bag-of-words model, and tagging behaviors are jointly considered. Experimental results demonstrate that the proposed features and methodology achieve superior performance over previous approaches.
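The matching step described above can be illustrated with a minimal sketch. It assumes that each keyframe-tag edge already carries a single affinity score; the abstract does not say how visual similarity and tagging behavior are combined into edge weights, so the numbers below are invented for illustration. The function name `best_matching` and the sample keyframes and tags are hypothetical, and exhaustive search stands in for whatever matching algorithm the paper actually uses (a polynomial-time method such as the Hungarian algorithm would be the usual choice for larger graphs).

```python
from itertools import permutations


def best_matching(weights, keyframes, tags):
    """Return (total_weight, pairs) for the maximum-weight matching.

    weights[i][j] is an assumed affinity between keyframes[i] and tags[j].
    Every keyframe is matched to a distinct tag, so this sketch requires
    len(keyframes) <= len(tags). Exhaustive search is only practical for
    very small graphs.
    """
    n, m = len(keyframes), len(tags)
    if n > m:
        raise ValueError("need at least as many candidate tags as keyframes")

    best_total, best_pairs = float("-inf"), []
    # Each permutation assigns a distinct tag index to every keyframe.
    for cols in permutations(range(m), n):
        total = sum(weights[i][j] for i, j in enumerate(cols))
        if total > best_total:
            best_total = total
            best_pairs = [(keyframes[i], tags[j]) for i, j in enumerate(cols)]
    return best_total, best_pairs


# Hypothetical data: two keyframes from one shot, three candidate tags.
keyframes = ["kf0", "kf1"]
tags = ["beach", "sunset", "dog"]
weights = [
    [0.9, 0.4, 0.1],  # made-up affinities of kf0 to each tag
    [0.8, 0.7, 0.2],  # made-up affinities of kf1 to each tag
]

total, pairs = best_matching(weights, keyframes, tags)
for keyframe, tag in pairs:
    print(keyframe, "->", tag)
```

In this toy example both keyframes score highest on "beach", but because a matching uses each tag at most once, the second keyframe is paired with "sunset" instead. That constraint is what separates a matching from simply taking the top-scoring tag for each keyframe.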