Nonlinear matrix factorization with unified embedding for social tag relevance learning

With the proliferation of social images, social image tagging has become essential for text-based social image retrieval. However, the original tags annotated by web users are often noisy, irrelevant, or incomplete as descriptions of the visual content. In this paper, we propose a nonlinear matrix factorization method that uses the inter- and intra-correlations among images and tags as priors to predict the relevance of tags to the visual content. The method embeds the image latent feature space and the tag latent feature space in a unified space, so that each image and each tag is represented as a point in that space. The relationship between an image and a tag can then be estimated directly from their distance or similarity in the unified space. Image tagging or tag recommendation can thus be solved efficiently by a nearest tag-neighbor search in the unified space. Similarly, retrieving the most relevant images for a given tag supports keyword-based image search. We evaluate the proposed method on tag recommendation and image search, and compare it with existing work on the challenging NUS-WIDE dataset. Extensive experiments demonstrate the effectiveness and potential of the proposed method for real-world applications.
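
As a rough illustration of the nearest-neighbor step only, the sketch below assumes that images and tags have already been embedded as vectors in a shared space. The two-dimensional vectors, the names `image_vecs` and `tag_vecs`, the helper functions, and the choice of cosine similarity are all hypothetical; the abstract does not specify the similarity measure, and the factorization that would produce such embeddings is not shown.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def nearest(query, candidates, k):
    """Return the names of the k candidates most similar to the query vector."""
    ranked = sorted(
        candidates,
        key=lambda name: cosine(query, candidates[name]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy embeddings in a shared 2-D space (not learned here).
image_vecs = {"img_beach": [0.9, 0.1], "img_forest": [0.1, 0.9]}
tag_vecs = {"sea": [1.0, 0.0], "sand": [0.8, 0.3], "tree": [0.0, 1.0]}

# Tag recommendation: the tags nearest to an image.
recommended_tags = nearest(image_vecs["img_beach"], tag_vecs, k=2)

# Keyword-based image search: the images nearest to a tag.
retrieved_images = nearest(tag_vecs["tree"], image_vecs, k=1)

print(recommended_tags)
print(retrieved_images)
```

Because both tasks query the same space, one search routine serves tag recommendation and image search; only the roles of query and candidates are swapped.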
