Paper information - Automatic image annotation using inverse maps from semantic embeddings

Human annotation of large-scale image databases is time-consuming and error-prone. Because image databases are hard to mine using visual features or textual descriptors alone, it is common to transform the image features into a semantically meaningful space. In this paper, we propose to perform image annotation in a semantic space inferred from sparse representations. By constructing a semantic embedding of the visual features that is constrained to be close to the tag embedding, we show that a robust inverse map can be used to predict the tags. Experiments on standard datasets show the effectiveness of the proposed approach in automatic image annotation compared with existing methods.
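The general idea in the abstract (embed visual features near a tag embedding, then map back to tags) can be illustrated with a minimal sketch. This is not the authors' method: it assumes a truncated-SVD tag embedding and closed-form ridge regression in place of the sparse-representation-based embedding and the robust inverse map described in the paper. All function names (`fit_ridge`, `fit_annotator`, `predict_tags`), parameters, and the toy data are hypothetical.

```python
import numpy as np

def fit_ridge(A, B, lam=1e-2):
    """Closed-form solution of min_W ||A W - B||_F^2 + lam * ||W||_F^2."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ B)

def fit_annotator(X, T, dim=1, lam=1e-2):
    """X: (n, d) visual features, T: (n, k) binary tag matrix.

    Assumption: linear maps stand in for the paper's sparse-coding-based
    embedding and inverse map.
    """
    mean_t = T.mean(axis=0)
    # Tag embedding: project centered tag vectors onto the top `dim`
    # right singular vectors of the tag matrix.
    _, _, Vt = np.linalg.svd(T - mean_t, full_matrices=False)
    basis = Vt[:dim].T                      # (k, dim)
    E_tags = (T - mean_t) @ basis           # (n, dim)
    # Visual embedding: a linear map fitted so that X @ W stays close
    # to the tag embedding.
    W = fit_ridge(X, E_tags, lam)           # (d, dim)
    # Inverse map: from the embedding space back to tag scores.
    M = fit_ridge(E_tags, T - mean_t, lam)  # (dim, k)
    return W, M, mean_t

def predict_tags(x, W, M, mean_t, top=2):
    """Return the indices of the `top` highest-scoring tags for x."""
    scores = (x @ W) @ M + mean_t
    return np.argsort(-scores, axis=-1)[..., :top]

# Toy data (hypothetical): two image clusters, each with its own tag pair.
rng = np.random.default_rng(0)
centers = np.zeros((2, 6))
centers[0, 0] = 1.0
centers[1, 1] = 1.0
labels = np.repeat([0, 1], 100)
X = centers[labels] + 0.05 * rng.standard_normal((200, 6))
T = np.array([[1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)[labels]

W, M, mean_t = fit_annotator(X, T, dim=1)
pred0 = predict_tags(centers[0], W, M, mean_t)  # tags for a cluster-0 image
pred1 = predict_tags(centers[1], W, M, mean_t)  # tags for a cluster-1 image
```

In this toy setting the two learned maps are composed at prediction time: a new image is first embedded with `W`, then the inverse map `M` turns the embedded point into per-tag scores. A faithful implementation would replace both ridge fits with the sparse-representation machinery the paper describes.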

Karthikeyan Natesan Ramamurthy | Andreas Spanias | Peer-Timo Bremer | Jayaraman J. Thiagarajan | Prasanna Sattigeri
