Cross-Media Entity Recognition in Nearly Parallel Visual and Textual Documents

We present a novel approach to automatically annotating images using only their associated text. We first detect and classify all entities (persons and objects) in the text. We then determine, for each entity, its salience (the importance of the entity in the text) and its visualness (the extent to which the entity can be perceived visually). We combine these two measures to compute the probability that the entity is present in the image. The suitability of our approach was successfully tested on 900 image-text pairs from Yahoo! News.
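As an illustration of the final step, the following minimal sketch combines the two measures into a single score per entity. The abstract does not state the combination rule, so the product used here is an assumption, as are the `Entity` class, the function names, the [0, 1] score range, and the example values, all of which are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entity:
    """A detected entity with its two text-derived measures (hypothetical type)."""
    name: str
    salience: float    # importance of the entity in the text, assumed to lie in [0, 1]
    visualness: float  # how visually perceivable the entity is, assumed to lie in [0, 1]


def presence_score(entity: Entity) -> float:
    """Estimate how likely the entity appears in the image.

    Assumption: the two measures are combined by multiplication.
    The source only says that they are combined.
    """
    return entity.salience * entity.visualness


def rank_entities(entities: List[Entity]) -> List[Entity]:
    """Sort entities from most to least likely to appear in the image."""
    return sorted(entities, key=presence_score, reverse=True)


# Illustrative values only; these are not results from the paper.
entities = [
    Entity("president", salience=0.9, visualness=1.0),
    Entity("election", salience=0.8, visualness=0.1),
    Entity("podium", salience=0.3, visualness=0.9),
]

for e in rank_entities(entities):
    print(f"{e.name}: {presence_score(e):.2f}")
```

Under this assumed rule, an entity that is salient but hardly visual (such as "election") ranks below a less salient but clearly visual one (such as "podium"), which is the behavior the two measures are meant to capture together.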
