An Interactive Semi-supervised Approach for Automatic Image Annotation

Automatic image annotation (AIA) is an effective technique for bridging the semantic gap between low-level image features and high-level semantics. However, most existing AIA approaches fail to exploit unlabeled data. In this paper, we present an interactive semi-supervised approach for AIA that integrates a graph propagation model with kernel canonical correlation analysis (KCCA). We aim to jointly utilize the keywords associated with labeled images and with selected unlabeled images to annotate the remaining unlabeled images. Toward this goal, we first estimate the annotations of unlabeled images with a consistency-driven graph propagation model. KCCA is then applied to measure the semantic consistency between the concurrent visual and textual features, and the unlabeled image with the highest consistency is added to the training set. With the enlarged training set, the semantic consistency between visual and textual representations can be further strengthened. Experiments on two standard databases validate the effectiveness of the proposed method.
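The annotation loop described above can be summarized in a short sketch. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes Gaussian affinities for the graph, the closed-form propagation F = (I - alpha*S)^(-1) Y from the local-and-global-consistency model, scikit-learn's linear CCA as a stand-in for KCCA, and a cosine score between projected visual and textual features as the "semantic consistency" used to pick the next image; all data, thresholds, and dimensions are toy values.

```python
# Minimal sketch of the interactive semi-supervised annotation loop.
# All names, feature dimensions, and the cosine-based consistency score
# are illustrative assumptions, not the paper's exact formulation.
import numpy as np
from sklearn.cross_decomposition import CCA  # linear CCA as a stand-in for KCCA


def propagate_labels(X, Y_labeled, n_labeled, alpha=0.99, sigma=1.0):
    """Consistency-driven graph propagation: F = (I - alpha*S)^(-1) Y,
    with S the symmetrically normalized Gaussian affinity matrix."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(1)))
    S = D_inv_sqrt @ W @ D_inv_sqrt
    Y = np.zeros((X.shape[0], Y_labeled.shape[1]))
    Y[:n_labeled] = Y_labeled
    return np.linalg.solve(np.eye(X.shape[0]) - alpha * S, Y)


def consistency_scores(X, T, n_labeled):
    """Fit CCA on the labeled pairs and score each unlabeled image by the
    cosine similarity of its projected visual and textual features."""
    cca = CCA(n_components=2)
    cca.fit(X[:n_labeled], T[:n_labeled])
    Xu, Tu = cca.transform(X[n_labeled:], T[n_labeled:])
    num = (Xu * Tu).sum(1)
    den = np.linalg.norm(Xu, axis=1) * np.linalg.norm(Tu, axis=1) + 1e-12
    return num / den


# Toy data: 6 labeled + 4 unlabeled images, 5-dim visual features, 3 keywords.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 5))  # visual features
Y_labeled = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                      [1, 1, 0], [0, 1, 1], [1, 0, 1]], dtype=float)

for _ in range(2):  # two interactive rounds
    n_labeled = Y_labeled.shape[0]
    F = propagate_labels(X, Y_labeled, n_labeled)     # step 1: estimate annotations
    scores = consistency_scores(X, F, n_labeled)      # step 2: KCCA-style scoring
    best = int(np.argmax(scores))                     # step 3: most consistent image
    sel = n_labeled + best
    # Move the selected image (with its estimated keywords) into the training
    # set by reordering so labeled samples stay at the front.
    order = list(range(n_labeled)) + [sel] + \
        [i for i in range(n_labeled, X.shape[0]) if i != sel]
    X = X[order]
    Y_labeled = np.vstack([Y_labeled, (F[sel] > 0.5).astype(float)])
    print("selected image", sel, "consistency scores:", np.round(scores, 3))
```

In each round the training set grows by the single most consistent unlabeled image, mirroring the iterative enlargement the abstract describes; a real system would use kernelized CCA, proper keyword thresholds, and a stopping criterion.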
