Analyzing Image-Text Relations for Semantic Media Adaptation and Personalization

Progress in semantic media adaptation and personalisation requires that we know more about how different media types, such as texts and images, work together in multimedia communication. To this end, we present our ongoing investigation into image-text relations. Our idea is that the ways in which the meanings of images and texts relate in multimodal documents, such as web pages, can be classified on the basis of low-level media features, and that this classification should be an early processing step in systems targeting semantic multimedia analysis. In this paper, we present the first empirical evidence that humans can predict something about the main theme of a text from an accompanying image, and that this prediction can be emulated by a machine via analysis of low-level image features. We close by discussing how these findings could affect applications for news adaptation and personalisation, and how they may generalise to other kinds of multimodal documents and to applications for semantic media retrieval, browsing, adaptation and creation.
