Multimedia information seeking through search and hyperlinking

Searching for relevant webpages and following hyperlinks to related content is a widely accepted and effective approach to information seeking on the textual web. Existing work on multimedia information retrieval has focused on search for individual relevant items or on content linking without specific attention to search results. We describe our research exploring integrated multimodal search and hyperlinking for multimedia data. Our investigation is based on the MediaEval 2012 Search and Hyperlinking task. This includes a known-item search task using the Blip10000 internet video collection, where automatically created hyperlinks link each relevant item to related items within the collection. The search test queries and link assessments for this task were generated using the Amazon Mechanical Turk crowdsourcing platform. Our investigation examines a range of alternative methods that seek to address the challenges of search and hyperlinking using multimodal approaches. The results of our experiments are used to propose a research agenda for developing effective techniques for search and hyperlinking of multimedia content.
