Bridging the Semantic Gap in Multimedia Information Retrieval: Top-down and Bottom-up Approaches

Semantic representation of multimedia information is vital for enabling the kind of multimedia search capabilities that professional searchers require. Manual annotation is often not possible because of the sheer scale of the multimedia information that needs indexing. This paper explores the ways in which we are using both top-down, ontologically driven approaches and bottom-up, automatic-annotation approaches to provide retrieval facilities to users. We also discuss many of the techniques that we are currently investigating to combine these top-down and bottom-up approaches.
