CUNI at TRECVID 2015: Video Hyperlinking Task

In this paper, we present our approach to the TRECVID 2015 Video Hyperlinking Task [13]. The approach combines three signals: text-based similarity calculated on subtitles, visual similarity between keyframes calculated using Feature Signatures, and a preference for retrieved answers that come from the same TV series as the query. All experiments were tuned and tested on about 2500 hours of BBC TV programmes. Our Baseline run uses fixed-length segmentation, text-based retrieval of subtitles, and query expansion that draws on metadata, context, information about the music and artists in the query segment, and visual concepts. The Series run combines the Baseline run with weighting based on whether the query and data segments come from the same TV series. The FS run combines the Baseline run with the similarity between query and data keyframes calculated using Feature Signatures. The FSSeriesRerank run applies reranking to the FS run, again using information about the TV series. The Series run significantly outperforms the FSSeriesRerank run, but both are significantly inferior to our Baseline run on all reported measures. The FS run outperforms the Baseline run on all measures, although the difference is significant only for the MAP score. Our test results confirm that visual similarity can improve video retrieval based on subtitles. Information about TV series, which was the most helpful feature in our training experiments, did not lead to further improvements on the test data.
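
The way such signals might be combined can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the fusion formula, so the linear interpolation, the multiplicative series boost, and all names and default values below (Candidate, fuse, visual_weight, series_boost) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A retrieved data segment (hypothetical structure)."""
    segment_id: str
    series: str
    text_score: float   # non-negative text-retrieval score on subtitles
    visual_sim: float   # keyframe similarity, assumed to lie in [0, 1]


def fuse(query_series, candidates, visual_weight=0.2, series_boost=1.0):
    """Rank candidates by an assumed linear fusion of text and visual scores.

    Text scores are divided by the maximum text score so that they are
    comparable with the visual similarity. Candidates from the same series
    as the query are multiplied by series_boost (1.0 means no preference).
    """
    if not candidates:
        return []
    # Fall back to 1.0 when every text score is zero to avoid dividing by 0.
    max_text = max(c.text_score for c in candidates) or 1.0
    scored = []
    for c in candidates:
        score = ((1.0 - visual_weight) * c.text_score / max_text
                 + visual_weight * c.visual_sim)
        if c.series == query_series:
            score *= series_boost
        scored.append((score, c.segment_id))
    # Sort by descending score; ties keep their input order (stable sort).
    scored.sort(key=lambda pair: -pair[0])
    return [segment_id for _, segment_id in scored]
```

In this sketch, visual_weight=0.0 with series_boost=1.0 ranks by text score alone, a positive visual_weight adds the keyframe similarity, and series_boost above 1.0 favours segments from the query's own series.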

[1] Djoerd Hiemstra et al. Using language models for information retrieval, 2001.

[2] Thomas Seidl et al. Indexing the signature quadratic form distance for efficient content-based multimedia retrieval, 2011, ICMR.

[3] Martin Krulis et al. Efficient Extraction of Feature Signatures Using Multi-GPU Architecture, 2013, MMM.

[4] Georges Quénot et al. TRECVID 2015 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms and Metrics, 2011, TRECVID.

[5] Carlo Tomasi et al. Perceptual metrics for image database navigation, 1999.

[6] Ben He et al. Terrier: A High Performance and Scalable Information Retrieval Platform, 2022.

[7] Martin Krulis et al. Combining CPU and GPU architectures for fast similarity search, 2012, Distributed and Parallel Databases.

[8] Maria Eskevich et al. The Search and Hyperlinking Task at MediaEval 2013, 2013, MediaEval.

[9] F. Wilcoxon. Individual Comparisons by Ranking Methods, 1945.

[10] Maria Eskevich et al. Adapting Binary Information Retrieval Evaluation Metrics for Segment-based Retrieval Tasks, 2013, ArXiv.

[11] Pavel Pecina et al. Audio Information for Hyperlinking of TV Content, 2015, SLAM@ACM Multimedia.

[12] Martin Krulis et al. CUNI at MediaEval 2014 Search and Hyperlinking Task: Visual and Prosodic Features in Hyperlinking, 2014, MediaEval.

[13] Jakub Lokoc et al. Ptolemaic access methods: Challenging the reign of the metric space model, 2013, Inf. Syst.

[14] Pedro Cano et al. A review of algorithms for audio fingerprinting, 2002, IEEE Workshop on Multimedia Signal Processing.

[15] Pavel Zezula et al. DISA at ImageCLEF 2014: The Search-based Solution for Scalable Image Annotation, 2014, CLEF.