Cross-media retrieval using probabilistic model of automatic image annotation

In recent years, automatic image annotation (AIA) has often been applied to cross-media retrieval because it mines correlations between images and annotation texts efficiently. However, some AIA methods annotate each image only as a whole, and the resulting annotation accuracy may not be acceptable. In this paper, we propose a probabilistic model that automatically assigns keywords to an un-annotated image based on a training dataset of images. Images in the training dataset are segmented into regions, and a vocabulary of blobs is used to represent these regions. The blobs are generated by clustering the image regions with the K-Means algorithm. With this model, we can predict the probability of assigning a keyword to a blob, so that after annotation each keyword corresponds to one image region. Furthermore, the feature vectors of text documents are generated by the TF-IDF method, and the images' automatic annotation information is used to retrieve relevant text documents. Experiments on the IAPR TC-12 dataset and 500 Wikipedia webpages about landscapes show the usefulness of applying a probabilistic model of AIA to cross-media retrieval.
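The pipeline described above (blobs, keyword probabilities per blob, TF-IDF retrieval) can be illustrated with a minimal Python sketch. This is not the model proposed in the paper: it assumes the regions have already been clustered into integer blob ids, and it swaps in a simple smoothed co-occurrence estimate of P(keyword | blob). The toy data and the names `TRAIN`, `DOCS`, `keyword_given_blob`, `annotate`, and `retrieve` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: each training image is (blob ids, keywords).
# Blob ids stand in for K-Means cluster labels of segmented regions.
TRAIN = [
    ([0, 1], ["sky", "mountain"]),
    ([0, 2], ["sky", "sea"]),
    ([1, 3], ["mountain", "tree"]),
    ([2, 0], ["sea", "sky"]),
    ([2, 3], ["sea", "tree"]),
]
DOCS = [
    "the sea meets the beach".split(),
    "a mountain under a clear sky".split(),
    "a tree in the city park".split(),
]

def keyword_given_blob(train, smoothing=0.1):
    """Estimate P(word | blob) from smoothed image-level co-occurrence
    counts (an assumed stand-in for the paper's probabilistic model)."""
    vocab = sorted({w for _, words in train for w in words})
    counts = defaultdict(Counter)
    for blobs, words in train:
        for b in set(blobs):
            for w in set(words):
                counts[b][w] += 1
    probs = {}
    for b, c in counts.items():
        total = sum(c.values()) + smoothing * len(vocab)
        probs[b] = {w: (c[w] + smoothing) / total for w in vocab}
    return probs

def annotate(blobs, probs):
    """Label each known blob with its most probable keyword."""
    return [max(probs[b], key=probs[b].get) for b in blobs if b in probs]

def tfidf_vectors(docs):
    """Return sparse TF-IDF vectors for tokenized documents, plus the IDF table."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = [{t: tf * idf[t] for t, tf in Counter(d).items()} for d in docs]
    return vecs, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(t, 0.0) for t, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(keywords, docs):
    """Rank document indices by similarity to the annotation keywords."""
    vecs, idf = tfidf_vectors(docs)
    query = {t: c * idf[t] for t, c in Counter(keywords).items() if t in idf}
    scores = [cosine(query, v) for v in vecs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)

probs = keyword_given_blob(TRAIN)
labels = annotate([0, 1], probs)   # un-annotated image with blobs 0 and 1
ranking = retrieve(labels, DOCS)   # text documents ranked for that image
print(labels, ranking)
```

Because each blob receives its own keyword, the annotation stays tied to individual image regions rather than to the image as a whole, and those keywords then act as the text query against the TF-IDF document vectors.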
