Image tag refinement towards low-rank, content-tag prior and error sparsity

The vast user-provided image tags on the popular photo sharing websites may greatly facilitate image retrieval and management. However, these tags are often imprecise and/or incomplete, resulting in unsatisfactory performances in tag related applications. In this work, the tag refinement problem is formulated as a decomposition of the user-provided tag matrix D into a low-rank refined matrix A and a sparse error matrix E, namely D = A + E, targeting the optimality measured by four aspects: 1) low-rank: A is of low-rank owing to the semantic correlations among the tags; 2) content consistency: if two images are visually similar, their tag vectors (i.e., column vectors of A) should also be similar; 3) tag correlation: if two tags co-occur with high frequency in general images, their co-occurrence frequency (described by two row vectors of A) should also be high; and 4) error sparsity: the matrix E is sparse since the tag matrix D is sparse and also humans can provide reasonably accurate tags. All these components finally constitute a constrained yet convex optimization problem, and an efficient convergence provable iterative procedure is proposed for the optimization based on accelerated proximal gradient method. Extensive experiments on two benchmark Flickr datasets, with 25K and 270K images respectively, well demonstrate the effectiveness of the proposed tag refinement approach.

[1]  Wei-Ying Ma,et al.  Bipartite graph reinforcement model for web image annotation , 2007, ACM Multimedia.

[2]  Dong Liu,et al.  Tag ranking , 2009, WWW '09.

[3]  Marc Teboulle,et al.  A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems , 2009, SIAM J. Imaging Sci..

[4]  R. Larsen Lanczos Bidiagonalization With Partial Reorthogonalization , 1998 .

[5]  Antonio Torralba,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[6]  Yi Liu,et al.  Semi-supervised Multi-label Learning by Constrained Non-negative Matrix Factorization , 2006, AAAI.

[7]  Wei-Ying Ma,et al.  Image annotation by large-scale content-based image retrieval , 2006, MM '06.

[8]  Paul M. B. Vitányi,et al.  The Google Similarity Distance , 2004, IEEE Transactions on Knowledge and Data Engineering.

[9]  Jitendra Malik,et al.  SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[10]  William I. Grosky,et al.  Narrowing the semantic gap - improved text-based web document retrieval using visual features , 2002, IEEE Trans. Multim..

[11]  Dong Liu,et al.  Image retagging , 2010, ACM Multimedia.

[12]  Changhu Wang,et al.  Scalable search-based image annotation , 2008, Multimedia Systems.

[13]  Emmanuel J. Candès,et al.  A Singular Value Thresholding Algorithm for Matrix Completion , 2008, SIAM J. Optim..

[14]  Mark J. Huiskes,et al.  The MIR flickr retrieval evaluation , 2008, MIR '08.

[15]  Changhu Wang,et al.  Content-Based Image Annotation Refinement , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Christoph H. Lampert,et al.  Learning to detect unseen object classes by between-class attribute transfer , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Chih-Jen Lin,et al.  A Practical Guide to Support Vector Classication , 2008 .

[18]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[19]  Hao Xu,et al.  Tag refinement by regularized LDA , 2009, ACM Multimedia.

[20]  Nenghai Yu,et al.  Multi-graph similarity reinforcement for image annotation refinement , 2008, 2008 15th IEEE International Conference on Image Processing.

[21]  Shuicheng Yan,et al.  Graph Embedding and Extensions: A General Framework for Dimensionality Reduction , 2007 .

[22]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[23]  Yi Ma,et al.  Robust principal component analysis? , 2009, JACM.

[24]  Latifur Khan,et al.  Image annotations by combining multiple evidence & wordNet , 2005, ACM Multimedia.

[25]  James Ze Wang,et al.  Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.

[26]  Arvind Ganesh,et al.  Fast Convex Optimization Algorithms for Exact Recovery of a Corrupted Low-Rank Matrix , 2009 .