On the detection of semantic concepts at TRECVID

Semantic multimedia management is necessary for the effective and widespread use of multimedia repositories and for realizing the potential that lies untapped in their rich multimodal information content. This challenge has driven researchers to devise new algorithms and systems that enable automatic or semi-automatic tagging of large-scale multimedia content with rich semantics. An emerging research area is the detection of a predetermined set of semantic concepts that can act as semantic filters and aid in search and manipulation. The NIST TRECVID benchmark has responded by creating a task that evaluates the performance of concept detection. Within the scope of this benchmark task, this paper studies trends in emerging concept detection systems, architectures, and algorithms. It also analyzes strategies that have yielded reasonable success, as well as the challenges and gaps that lie ahead.
