Paper Information - Representing Topics Using Images

Representing Topics Using Images

Automatically generated topics, e.g. from Latent Dirichlet Allocation (LDA), are now widely used in Computational Linguistics. A topic is normally represented as a set of keywords, often the n terms with the highest marginal probabilities in that topic. We introduce an alternative approach in which topics are represented using images. Candidate images for each topic are retrieved from the web by querying a search engine with the topic's top n terms. The most suitable image is then selected from this candidate set by a graph-based algorithm that uses both textual information from the metadata associated with each image and features extracted from the images themselves. We show that the proposed approach significantly outperforms several baselines and can provide images that are useful for representing a topic.
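The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the graph-based algorithm, so the code below assumes a PageRank-style ranking over a fully connected cosine-similarity graph; this is an illustrative stand-in, not necessarily the authors' method. The function names (`top_terms`, `rank_images`, `select_image`), the damping value, and the idea that each candidate image already has a single numeric feature vector combining metadata text and visual features are all hypothetical. Retrieval from a search engine is omitted.

```python
from math import sqrt


def top_terms(topic, n):
    """Return the n terms with the highest probability in a topic.

    `topic` is a dict mapping term -> probability (assumed input format).
    """
    ranked = sorted(topic.items(), key=lambda kv: kv[1], reverse=True)
    return [term for term, _ in ranked[:n]]


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_images(features, damping=0.85, iterations=50):
    """Score images with a PageRank-style walk over a similarity graph.

    Assumption: edge weights are cosine similarities between feature
    vectors; the abstract does not say how the graph is weighted.
    """
    n = len(features)
    # Weighted adjacency matrix without self-loops.
    weights = [
        [cosine(features[i], features[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for j in range(n):
            incoming = sum(
                scores[i] * weights[i][j] / out_weight[i]
                for i in range(n)
                if out_weight[i] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


def select_image(candidates):
    """Pick the highest-scoring image from (image_id, feature_vector) pairs."""
    scores = rank_images([vector for _, vector in candidates])
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best][0]


# Toy usage with made-up data: a topic and three candidate images.
topic = {"bank": 0.30, "money": 0.25, "river": 0.05, "loan": 0.20}
query_terms = top_terms(topic, 3)  # terms that would be sent to a search engine
candidates = [("a", [1.0, 0.0]), ("b", [0.9, 0.1]), ("c", [0.0, 1.0])]
chosen = select_image(candidates)
```

In this toy graph the image most similar to the other candidates receives the highest score, which is the intuition behind choosing a central image as the topic's representative.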

Mark Stevenson | Nikolaos Aletras
