Two-layer re-ranking approach based on contextual information for visual concept detection in videos

Context helps to understand the meaning of a word and allows the disambiguation of polysemous terms, and much research in information retrieval has taken advantage of this notion. For concept-based video indexing and retrieval, the idea seems a priori valid as well. One of the major difficulties is then to define the context and to choose the most appropriate methods for exploiting it. Two kinds of context have been exploited in the past to improve concept detection: some works use inter-concept relations as a semantic context, while other approaches use the temporal structure of videos. The results of these works showed that both the “temporal” and the “semantic” contexts can improve concept detection. In this work we use the semantic context through an ontology and exploit the efficiency of the temporal context in a “two-layer” re-ranking approach. Experiments conducted on TRECVID 2010 data show that the proposed approach consistently improves over the initial results obtained with MSVM or KNN classifiers or their late fusion, achieving relative gains of 9% to 33% in the MAP measure.
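The abstract does not detail the two layers, but the general idea of contextual re-ranking can be sketched as two successive re-scoring passes over initial per-shot classifier scores: one using semantically related concepts, one using temporally neighboring shots. The sketch below is a minimal illustration under assumed conventions, not the authors' implementation; the toy ontology, the blending weights `alpha` and `beta`, and the window size are all hypothetical.

```python
# Illustrative sketch (not the authors' method): two-layer re-ranking of
# per-shot concept detection scores. Layer 1 blends each score with the
# scores of ontology-related concepts (semantic context); layer 2 smooths
# scores over neighboring shots (temporal context). Weights are hypothetical.

def semantic_rerank(scores, related, alpha=0.7):
    """Layer 1: per shot, blend a concept's score with the mean score
    of its semantically related concepts."""
    out = {}
    for concept, per_shot in scores.items():
        neighbors = related.get(concept, [])
        rescored = []
        for i, s in enumerate(per_shot):
            if neighbors:
                ctx = sum(scores[n][i] for n in neighbors) / len(neighbors)
                rescored.append(alpha * s + (1 - alpha) * ctx)
            else:
                rescored.append(s)  # no semantic context available
        out[concept] = rescored
    return out

def temporal_rerank(scores, window=1, beta=0.7):
    """Layer 2: blend each shot's score with the mean score over a
    small window of surrounding shots."""
    out = {}
    for concept, per_shot in scores.items():
        n = len(per_shot)
        smoothed = []
        for i, s in enumerate(per_shot):
            lo, hi = max(0, i - window), min(n, i + window + 1)
            ctx = sum(per_shot[lo:hi]) / (hi - lo)
            smoothed.append(beta * s + (1 - beta) * ctx)
        out[concept] = smoothed
    return out

# Toy example: initial classifier scores for two related concepts
# over three consecutive shots of one video.
scores = {"road": [0.2, 0.8, 0.7], "car": [0.6, 0.9, 0.1]}
reranked = temporal_rerank(semantic_rerank(scores, {"car": ["road"]}))
```

In this sketch, an isolated low score (the third "car" shot) is pulled toward its temporal neighbors, which mirrors the intuition that concepts rarely appear or vanish within a single shot; the actual combination scheme and ontology used in the paper may differ.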
