Semantic Video Indexing

Query-by-keyword remains the paradigm on which machine-based text search is built. Building on the success of text-based search engines, query-by-keyword is also gaining momentum in multimedia retrieval. Access to multimedia archives is hard to achieve, however, when it relies on text alone; multimodal indexing is essential for effective access to video archives. For the automatic detection of specific concepts, the state of the art has produced sophisticated and specialized indexing methods. Unlike their textual counterparts, generic methods for semantic indexing in multimedia are neither generally available, nor scalable in their computational needs, nor robust in their performance. As a consequence, semantic access to multimedia archives is still limited, and there is a case to be made for a new approach to semantic video indexing.
