An Overview of Multimodal Video Representation for Semantic Analysis

This paper gives an overview of approaches to video representation targeting semantic analysis for content-based indexing and retrieval. It highlights the major achievements of existing methodologies and sheds new light on the challenges that remain unsolved. The problem of adaptive representation of digital multimedia is critically assessed, and some novel ideas are presented. In addition, the concept of video multimodality is re-evaluated and redefined in order to introduce modalities such as editing technique. An extensive literature survey of the topics involved is given.
