Learning to Rank Image Tags With Limited Training Examples

With the growing number of images available on social media, image annotation has emerged as an important research topic because of its applications in image matching and retrieval. Most studies cast image annotation as a multilabel classification problem. The main shortcoming of this approach is that it requires a large number of training images with clean and complete annotations to learn a reliable model for tag prediction. We address this limitation by developing a novel approach that combines the strength of tag ranking with the power of matrix recovery. Instead of making a binary decision for each tag, our approach ranks tags in descending order of their relevance to the given image, which significantly simplifies the problem. In addition, the proposed method aggregates the prediction models for different tags into a matrix and casts tag ranking as a matrix recovery problem. It introduces the matrix trace norm to explicitly control model complexity, so that a reliable prediction model can be learned for tag ranking even when the tag space is large and the number of training images is limited. Experiments on multiple well-known image data sets demonstrate the effectiveness of the proposed framework for tag ranking compared with state-of-the-art approaches for image annotation and tag ranking.
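
To make the idea of trace-norm-regularized tag ranking concrete, the following is a minimal sketch, not the authors' implementation. It stacks one linear predictor per tag into a matrix W, penalizes the trace norm of W, and ranks tags by their predicted scores. The abstract does not state the loss function or the optimizer, so the squared loss, the plain proximal gradient loop, and the names svt, fit_tag_ranker, rank_tags, lam, and n_iter are all assumptions made for illustration.

```python
import numpy as np

def svt(W, tau):
    """Proximal operator of tau * trace norm: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def fit_tag_ranker(X, Y, lam=0.1, n_iter=200):
    """Proximal gradient on (1/2n)||XW - Y||_F^2 + lam * ||W||_tr.

    X: (n, d) image features; Y: (n, m) binary tag matrix.
    The squared loss is an assumed stand-in for a ranking loss.
    """
    n, d = X.shape
    W = np.zeros((d, Y.shape[1]))
    # Step size 1/L, where L = ||X||_2^2 / n bounds the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y) / n
        W = svt(W - step * grad, step * lam)
    return W

def rank_tags(W, x):
    """Return tag indices sorted by descending predicted relevance for feature vector x."""
    return np.argsort(-(x @ W))
```

Calling rank_tags(W, x) returns all tag indices in ranked order, so no per-tag threshold is needed; the top k entries can be taken as the suggested tags. Soft-thresholding the singular values in svt is what pushes W toward low rank, which is the sense in which the trace norm controls model complexity.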
