Semantic context learning with large-scale weakly-labeled image set

There are a large number of images available on the web; meanwhile, only a subset of web images can be labeled by professionals because manual annotation is time-consuming and labor-intensive. Although we can now use the collaborative image tagging system, e.g., Flickr, to get a lot of tagged images provided by Internet users, these labels may be incorrect or incomplete. Furthermore, semantics richness requires more than one label to describe one image in real applications, and multiple labels usually interact with each other in semantic space. It is of significance to learn semantic context with large-scale weakly-labeled image set in the task of multi-label annotation. In this paper, we develop a novel method to learn semantic context and predict the labels of web images in a semi-supervised framework. To address the scalability issue, a small number of exemplar images are first obtained to cover the whole data cloud; then the label vector of each image is estimated as a local combination of the exemplar label vectors. Visual context, semantic context, and neighborhood consistency in both visual and semantic spaces are sufficiently leveraged in the proposed framework. Finally, the semantic context and the label confidence vectors for exemplar images are both learned in an iterative way. Experimental results on the real-world image dataset demonstrate the effectiveness of our method.

[1]  Shuicheng Yan,et al.  Efficient large-scale image annotation by probabilistic collaborative multi-label propagation , 2010, ACM Multimedia.

[2]  Shuicheng Yan,et al.  Image tag refinement towards low-rank, content-tag prior and error sparsity , 2010, ACM Multimedia.

[3]  Nenghai Yu,et al.  Flickr distance , 2008, ACM Multimedia.

[4]  Ivor W. Tsang,et al.  Tag-based web photo retrieval improved by batch mode re-tagging , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5]  Jianping Fan,et al.  Harvesting large-scale weakly-tagged image databases from the web , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[6]  R. Manmatha,et al.  Multiple Bernoulli relevance models for image and video annotation , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[7]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[8]  Antonio Torralba,et al.  Semi-Supervised Learning in Gigantic Image Collections , 2009, NIPS.

[9]  Dong Liu,et al.  Image retagging , 2010, ACM Multimedia.

[10]  Tao Mei,et al.  Graph-based semi-supervised learning with multiple labels , 2009, J. Vis. Commun. Image Represent..

[11]  Wei-Ying Ma,et al.  Bipartite graph reinforcement model for web image annotation , 2007, ACM Multimedia.

[12]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[13]  Jianping Fan,et al.  Automatic image annotation with weakly labeled dataset , 2011, MM '11.

[14]  Tao Mei,et al.  Joint multi-label multi-instance learning for image classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[16]  Vladimir Pavlovic,et al.  A New Baseline for Image Annotation , 2008, ECCV.

[17]  Dong Liu,et al.  Unified tag analysis with multi-edge graph , 2010, ACM Multimedia.

[18]  Antonio Torralba,et al.  Exploiting hierarchical context on a large database of object categories , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[19]  Shuicheng Yan,et al.  Inferring semantic concepts from community-contributed images and noisy tags , 2009, ACM Multimedia.

[20]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[21]  Cordelia Schmid,et al.  TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[22]  Changhu Wang,et al.  Content-Based Image Annotation Refinement , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Wei Liu,et al.  Large Graph Construction for Scalable Semi-Supervised Learning , 2010, ICML.