TSVM-HMM: Transductive SVM based hidden Markov model for automatic image annotation

Automatic image annotation (AIA) is an effective technique for improving image retrieval performance. In this paper, we propose a novel AIA scheme based on the hidden Markov model (HMM). In contrast to previous HMM-based annotation methods, we employ SVM-based semi-supervised learning, namely the transductive SVM (TSVM), to markedly boost the reliability of the HMM while requiring less labeling effort from users; we denote the resulting model TSVM-HMM. The proposed TSVM-HMM annotation scheme thus integrates discriminative classification with a generative model so that the two complement each other. In addition, the scheme exploits not only the relevance model between the visual content of images and the textual keywords but also the correlation among keywords. In particular, to build an enhanced keyword correlation network, co-occurrence-based and WordNet-based correlation techniques are fused so that they benefit from each other. Experimental results show that the scheme achieves better annotation performance with fewer labeled training images.
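As a rough illustration of the keyword correlation fusion mentioned above, the following sketch combines a co-occurrence-based correlation matrix with a lexical similarity matrix. It is not the formulation used in the paper, which the abstract does not specify: the Jaccard-style co-occurrence score, the linear fusion rule, the weight `alpha`, and the function names are all assumptions made for illustration, and the `lexical` matrix is a hand-written placeholder standing in for any WordNet-based similarity measure.

```python
from itertools import combinations


def cooccurrence_correlation(annotations, vocabulary):
    """Jaccard-style co-occurrence between keywords (an assumed measure).

    corr[i][j] = (#images labeled with both i and j) / (#images labeled with either).
    The diagonal is set to 1.0 by convention.
    """
    index = {word: i for i, word in enumerate(vocabulary)}
    n = len(vocabulary)
    count = [0] * n                      # images containing each keyword
    both = [[0] * n for _ in range(n)]   # images containing each keyword pair

    for labels in annotations:
        present = sorted({index[w] for w in labels if w in index})
        for i in present:
            count[i] += 1
        for i, j in combinations(present, 2):
            both[i][j] += 1
            both[j][i] += 1

    corr = [[0.0] * n for _ in range(n)]
    for i in range(n):
        corr[i][i] = 1.0
        for j in range(n):
            if i != j:
                union = count[i] + count[j] - both[i][j]
                corr[i][j] = both[i][j] / union if union else 0.0
    return corr


def fuse(cooc, lexical, alpha=0.5):
    """Linear fusion of two correlation matrices (hypothetical fusion rule)."""
    return [
        [alpha * c + (1.0 - alpha) * l for c, l in zip(row_c, row_l)]
        for row_c, row_l in zip(cooc, lexical)
    ]


vocabulary = ["sky", "sea", "tiger"]
annotations = [["sky", "sea"], ["sky", "sea"], ["sky"], ["tiger"]]

# Placeholder lexical similarities; in practice these would come from a
# WordNet-based measure rather than being written by hand.
lexical = [
    [1.0, 0.4, 0.1],
    [0.4, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]

cooc = cooccurrence_correlation(annotations, vocabulary)
fused = fuse(cooc, lexical, alpha=0.5)
```

In this toy data, "sky" and "sea" co-occur often and so keep a high fused score, while "sky" and "tiger" never co-occur and retain only the small contribution from the lexical term. This is the sense in which one source can cover pairs the other misses.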
