Content-Based Analysis Improves Audiovisual Archive Retrieval

Content-based video retrieval is maturing to the point where it can be used in real-world retrieval practices. One such practice is the audiovisual archive, whose users increasingly require fine-grained access to broadcast television content. In this paper, we take into account the information needs and retrieval data already present in the audiovisual archive, and demonstrate that retrieval performance can be significantly improved when content-based methods are applied to search. To the best of our knowledge, this is the first time that the practice of an audiovisual archive has been taken into account for quantitative retrieval evaluation. To arrive at our main result, we propose an evaluation methodology tailored to the specific needs and circumstances of the audiovisual archive, which are typically missed by existing evaluation initiatives. We use logged searches, content purchases, session information, and simulators to create realistic query sets and relevance judgments. To reflect the retrieval practice of both the archive and the video retrieval community as closely as possible, our experiments with three video search engines incorporate archive-created catalog entries as well as state-of-the-art multimedia content analysis results. A detailed query-level analysis indicates that individual content-based retrieval methods such as transcript-based retrieval and concept-based retrieval yield approximately equal performance gains. When these methods are combined and incorporated into the archive's practice, we find significant performance increases both for shot retrieval and for retrieving entire television programs. The time has come for audiovisual archives to start adopting content-based video retrieval methods in their daily practice.
