Integrating distance metric learning into label propagation model for multi-label image annotation

Existing approaches to automatic image annotation usually suffer from two issues: (1) they lack a good distance metric for measuring the semantic similarity between images; (2) they rarely consider the correlation between the labels assigned to each image. In this paper, we address both problems simultaneously in a novel unified framework. Specifically, a distance metric is learned discriminatively with a structural SVM, so that it optimizes the ranking of images induced by their distances from a test image. A collaborative label propagation algorithm is then used to model the correlation between class labels explicitly, with the learned metric embedded in the propagation model. Integrating the two components leads to more accurate annotation results. Experiments conducted on the Corel dataset demonstrate the effectiveness of the proposed unified framework.
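The way a learned metric can feed into label propagation may be illustrated with a minimal sketch. It is a simplified stand-in, not the algorithm of this paper: the metric matrix `W` is assumed to be given (here it would come from the structural SVM training, which is not shown), and the correlation step is a plain co-occurrence smoothing chosen for illustration. The names `mahalanobis_sq`, `propagate_labels`, `k`, and `alpha` are hypothetical.

```python
import numpy as np

def mahalanobis_sq(x, X, W):
    """Squared distances (x - x_i)^T W (x - x_i) from x to every row of X."""
    diff = X - x
    return np.einsum("ij,jk,ik->i", diff, W, diff)

def propagate_labels(x, X_train, Y_train, W, k=3, alpha=0.5):
    """Score labels for test image x (illustrative simplification).

    W       : assumed pre-learned positive semidefinite metric matrix
    Y_train : binary label matrix, shape (n_images, n_labels)
    k       : number of nearest neighbours used (assumption)
    alpha   : weight of the label-correlation term (assumption)
    """
    Y = np.asarray(Y_train, dtype=float)
    d = mahalanobis_sq(x, X_train, W)

    # Rank training images by learned distance and keep the k nearest.
    nn = np.argsort(d)[:k]
    weights = np.exp(-d[nn])
    weights /= weights.sum()

    # Propagate neighbour labels to the test image.
    scores = weights @ Y[nn]

    # Row-normalised label co-occurrence, with self co-occurrence removed.
    C = Y.T @ Y
    np.fill_diagonal(C, 0.0)
    row = C.sum(axis=1, keepdims=True)
    C = np.divide(C, row, out=np.zeros_like(C), where=row > 0)

    # Let each label pass part of its score to labels it co-occurs with.
    return (1.0 - alpha) * scores + alpha * (C.T @ scores)
```

With `alpha=0` the sketch reduces to distance-weighted nearest-neighbour voting under the metric `W`; larger values of `alpha` let frequently co-occurring labels reinforce one another.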
