Tagging image by merging multiple features in a integrated manner

Image tagging is a task that automatically assigns the query image with semantic keywords called tags, which significantly facilitates image search and organization. Since tags and image visual content are represented in different feature space, how to merge the multiple features by their correlation to tag the query image is an important problem. However, most of existing approaches merge the features by using a relatively simple mechanism rather than fully exploiting the correlations between different features. In this paper, we propose a new approach to fusing different features and their correlation simultaneously for image tagging. Specifically, we employ a Feature Correlation Graph to capture the correlations between different features in an integrated manner, which take features as nodes and their correlations as edges. Then, a revised probabilistic model based on Markov Random Field is used to describe the graph for evaluating tag’s relevance to query image. Based on that, we design an image tagging algorithm for large scale web image dataset. We evaluate our approach using two large real-life corpuses collected from Flickr, and the experimental results indicate the superiority of our proposed approach over state-of-the-art techniques.

[1]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[2]  Gang Chen,et al.  Semi-supervised Multi-label Learning by Solving a Sylvester Equation , 2008, SDM.

[3]  Anthony K. H. Tung,et al.  Multiple feature fusion for social media applications , 2010, SIGMOD Conference.

[4]  Mark J. Huiskes,et al.  The MIR flickr retrieval evaluation , 2008, MIR '08.

[5]  Bin Wang,et al.  Dual cross-media relevance model for image annotation , 2007, ACM Multimedia.

[6]  Qi Zhang,et al.  Automatic image annotation by an iterative approach: incorporating keyword correlations and region matching , 2007, CIVR '07.

[7]  Naphtali Rishe,et al.  Content-based image retrieval , 1995, Multimedia Tools and Applications.

[8]  Dong Liu,et al.  Tag ranking , 2009, WWW '09.

[9]  Christiane Fellbaum,et al.  Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.

[10]  Marcel Worring,et al.  Learning Social Tag Relevance by Neighbor Voting , 2009, IEEE Transactions on Multimedia.

[11]  Chong-Wah Ngo,et al.  Semantic context modeling with maximal margin Conditional Random Fields for automatic image annotation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[12]  Jianping Fan,et al.  Leveraging loosely-tagged images and inter-object correlations for tag recommendation , 2010, ACM Multimedia.

[13]  Chong-Wah Ngo,et al.  A revisit of Generative Model for Automatic Image Annotation using Markov Random Fields , 2009, CVPR.

[14]  David W. Conrath,et al.  Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy , 1997, ROCLING/IJCLCLP.

[15]  James Ze Wang,et al.  Real-Time Computerized Annotation of Pictures , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  W. Bruce Croft,et al.  Direct Maximization of Rank-Based Metrics for Information Retrieval , 2005 .

[17]  Tao Mei,et al.  Correlative multi-label video annotation , 2007, ACM Multimedia.

[18]  Roelof van Zwol,et al.  Flickr tag recommendation based on collective knowledge , 2008, WWW.

[19]  Claudio Gutierrez,et al.  Survey of graph database models , 2008, CSUR.

[20]  R. Manmatha,et al.  A discrete direct retrieval model for image and video retrieval , 2008, CIVR '08.

[21]  Xian-Sheng Hua,et al.  Collaborative learning for image and video annotation , 2008, MIR '08.

[22]  Meng Wang,et al.  Correlative Linear Neighborhood Propagation for Video Annotation , 2009, IEEE Trans. Syst. Man Cybern. Part B.

[23]  Mark Sanderson,et al.  Automatic video tagging using content redundancy , 2009, SIGIR.

[24]  Nenghai Yu,et al.  Visual language modeling for image classification , 2007, MIR '07.

[25]  J. Laurie Snell,et al.  I. The Ising model , 1980 .

[26]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[27]  W. Bruce Croft,et al.  A Markov random field model for term dependencies , 2005, SIGIR '05.

[28]  Nenghai Yu,et al.  Distance metric learning from uncertain side information with application to automated photo tagging , 2009, ACM Multimedia.

[29]  Qi Tian,et al.  Multi-label boosting for image annotation by structural grouping sparsity , 2010, ACM Multimedia.

[30]  Yinghui Xu,et al.  Automatic image tagging as a random walk with priors on the canonical correlation subspace , 2008, MIR '08.

[31]  J. Laurie Snell,et al.  Markov Random Fields and Their Applications , 1980 .

[32]  Nenghai Yu,et al.  Learning to tag , 2009, WWW '09.

[33]  Chong-Wah Ngo,et al.  A revisit of Generative Model for Automatic Image Annotation using Markov Random Fields , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Dong Liu,et al.  Image retagging , 2010, ACM Multimedia.

[35]  Wei-Ying Ma,et al.  An adaptive graph model for automatic image annotation , 2006, MIR '06.

[36]  Sourav S. Bhowmick,et al.  Quantifying tag representativeness of visual content of social images , 2010, ACM Multimedia.

[37]  James Ze Wang,et al.  Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.

[38]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..