Automatic Web Image Annotation via Web-Scale Image Semantic Space Learning

Keyword correlation has been exploited to improve Automatic Image Annotation (AIA). Unlike traditional approaches, which estimate keyword correlation from a lexicon or from training data, we propose Web-scale image semantic space learning to explore keyword correlation for automatic Web image annotation. Specifically, we use the social media Web site Flickr as a Web-scale image semantic space from which we derive a keyword correlation graph, and we use that graph to smooth the annotation probability estimates. To further improve Web image annotation performance, we present a novel constraint piecewise penalty weighted regression model that estimates the semantics of a Web image from its associated text. We integrate both approaches into our Web image annotation framework and conduct experiments on a real Web image data set. The experimental results show that each approach significantly improves annotation performance.
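
To make the graph-smoothing idea concrete, the following is a minimal sketch and not the method of the paper. It assumes that keyword correlation is approximated by tag co-occurrence counts (for example, in tag lists collected from Flickr) and that smoothing is a simple linear mix of a keyword's own score with the scores of its graph neighbours. The function names, the mixing weight `alpha`, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(tag_lists):
    """Build row-normalised correlation weights from tag co-occurrence.

    Assumption: co-occurrence in a tag list stands in for keyword correlation.
    """
    pair_counts = Counter()
    for tags in tag_lists:
        for a, b in combinations(sorted(set(tags)), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    totals = Counter()
    for (a, _), count in pair_counts.items():
        totals[a] += count
    graph = {}
    for (a, b), count in pair_counts.items():
        graph.setdefault(a, {})[b] = count / totals[a]
    return graph


def smooth(probs, graph, alpha=0.5):
    """Mix each keyword's score with the weighted scores of its neighbours.

    Assumption: a single linear interpolation step with hypothetical weight alpha.
    Keywords without neighbours in the graph keep their original score.
    """
    smoothed = {}
    for keyword, p in probs.items():
        neighbours = graph.get(keyword, {})
        if not neighbours:
            smoothed[keyword] = p
            continue
        neighbour_score = sum(w * probs.get(n, 0.0) for n, w in neighbours.items())
        smoothed[keyword] = (1 - alpha) * p + alpha * neighbour_score
    return smoothed


# Toy data: three tag lists and initial annotation scores for one image.
tag_lists = [["beach", "sea"], ["beach", "sea", "sand"], ["car", "road"]]
initial = {"beach": 0.6, "sea": 0.1, "car": 0.3}
graph = cooccurrence_graph(tag_lists)
smoothed = smooth(initial, graph)
print(sorted(smoothed, key=smoothed.get, reverse=True))
```

In this toy case "sea" starts below "car" but is pulled up by its strong co-occurrence with the high-scoring "beach", which is the intuition behind using keyword correlation to refine annotation scores.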
