An Investigation into Feature Effectiveness for Multimedia Hyperlinking

The growing amount of archival multimedia content available online creates new opportunities for users interested in exploratory search behaviour such as browsing. The user experience with online collections could therefore be improved by enabling navigation and recommendation within multimedia archives, for example by allowing a user to follow hyperlinks created within or across documents. The main goal of this study is to compare the performance of different multimedia features for automatic hyperlink generation. We construct multimedia hyperlinks by indexing and searching textual and visual features extracted from the blip.tv dataset. We then propose a user-driven evaluation strategy based on the Amazon Mechanical Turk (AMT) crowdsourcing platform, since we believe that AMT workers are a good example of "real world" users. We conclude that textual features perform better than visual features for multimedia hyperlink construction. In general, a combination of ASR transcripts and metadata provides the best results.
