Evaluation of Automatic Tag Sense Disambiguation Using the MIRFLICKR Image Collection

Automatically identifying the intended meaning of tags is challenging in large image collections, where human authors assign tags for emotional or professional reasons. Algorithms for automatic tag disambiguation need “golden” collections of manually created tags to establish baselines for accuracy assessment. Here we show how to use the MIRFLICKR-25000 collection to evaluate the performance of our tag sense disambiguation algorithm, which identifies the meanings of image tags based on WordNet or Wikipedia. We present three types of observations on the disambiguated tags: (i) the accuracy of disambiguation, (ii) the semantic similarity of individual tags to the image category, and (iii) the semantic similarity of an image tagset to the image category, using different word embedding models for the latter two. We show how word embeddings provide a specific baseline against which the results can be compared. The accuracy we achieve is 78.6%.
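The three kinds of observation listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how similarity is computed or how a tagset is represented, so cosine similarity and a mean-vector tagset representation are assumptions here, and the function names and the toy two-dimensional embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def accuracy(predicted, gold):
    """(i) Fraction of gold-annotated tags whose predicted sense matches the gold sense."""
    if not gold:
        return 0.0
    correct = sum(1 for tag, sense in gold.items() if predicted.get(tag) == sense)
    return correct / len(gold)


def tag_similarities(tags, category, emb):
    """(ii) Similarity of each individual tag to the image category.

    Tags without an embedding are skipped.
    """
    return {t: cosine(emb[t], emb[category]) for t in tags if t in emb}


def tagset_similarity(tags, category, emb):
    """(iii) Similarity of the whole tagset to the image category.

    Assumption: the tagset is represented by the mean of its tag vectors.
    """
    vectors = [emb[t] for t in tags if t in emb]
    if not vectors:
        return 0.0
    dim = len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return cosine(centroid, emb[category])


if __name__ == "__main__":
    # Hypothetical toy embeddings; a real run would load a trained model.
    toy_emb = {"water": [1.0, 0.0], "river": [1.0, 0.0], "bank": [0.6, 0.8]}
    tags = ["river", "bank"]
    print(tag_similarities(tags, "water", toy_emb))
    print(tagset_similarity(tags, "water", toy_emb))
```

In a sketch like this, swapping in a different embedding model only changes the `emb` lookup table, which is what makes the similarity scores comparable across models.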
