Visual Descriptors in Methods for Video Hyperlinking

In this paper, we survey state-of-the-art visual processing methods and apply them to video hyperlinking. Visual information, computed using Feature Signatures, SIMILE descriptors, and convolutional neural networks (CNNs), is used to measure similarity between video frames and to find similar faces, objects, and settings. Visual concepts in frames are also recognized automatically, and the textual output of the recognition is combined with search based on subtitles and transcripts. All presented experiments were performed within the Search and Hyperlinking 2014 MediaEval task and the Video Hyperlinking 2015 TRECVid task.
