Which Information Sources are More Effective and Reliable in Video Search

Users are often interested in finding video segments that contain further information about the content of a segment of interest. To help users find and browse related video content, video hyperlinking constructs links among video segments with relevant information in a large video collection. In this study, we explore how various video features affect video hyperlinking performance, including subtitles, metadata, content features (i.e., audio and visual), surrounding context, and combinations of these features. We also test different search strategies over different types of queries, categorized according to their video content. Comprehensive experiments were conducted on the dataset of the TRECVID 2015 video hyperlinking task. The results show that (1) text features play a crucial role in search performance, and combining audio and visual features does not yield improvements; (2) incorporating surrounding context does not produce better results; and (3) owing to the lack of training examples, machine learning techniques do not improve performance.
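To make the text-feature finding concrete, the following is a minimal sketch (an assumption for illustration, not the paper's exact method) of hyperlinking driven purely by subtitle text: candidate segments are ranked against an anchor segment by TF-IDF cosine similarity over their subtitles.

```python
# Hypothetical sketch: rank candidate video segments against an anchor
# segment using TF-IDF cosine similarity over subtitle text only.
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build TF-IDF weight dicts for a list of tokenized documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))                      # document frequency per term
    idf = {t: math.log(n / df[t]) for t in df}   # inverse document frequency
    vecs = []
    for doc in docs:
        tf = Counter(doc)                        # raw term frequency
        vecs.append({t: tf[t] * idf[t] for t in tf})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse weight dicts."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_links(anchor_subtitle, candidate_subtitles):
    """Return (candidate index, score) pairs, best match first."""
    docs = [anchor_subtitle.lower().split()] + [
        s.lower().split() for s in candidate_subtitles
    ]
    vecs = tfidf_vectors(docs)
    scores = [(i, cosine(vecs[0], v)) for i, v in enumerate(vecs[1:])]
    return sorted(scores, key=lambda x: x[1], reverse=True)
```

In a full system the ranked list would be cut off at a score threshold or a fixed number of links per anchor; the study's results suggest that fusing audio-visual similarity into such a text-based ranker brings little additional benefit.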
