Known-Item Search in Video Databases with Textual Queries

In this paper, we present two approaches to known-item search in video databases with textual queries. The first approach requires the database objects to be labeled by an arbitrary ImageNet classification model. During the search, the set of query words is expanded with synonyms and hypernyms until words present in the database are encountered, and those words are then searched for. The second approach delegates the query to an independent database, such as Google Images, and lets the user pick a suitable result for query-by-example search. The effectiveness of both approaches is evaluated in a user study.
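
The query expansion in the first approach can be illustrated with a minimal sketch. It assumes a breadth-first traversal over synonyms and hypernyms, which the abstract does not specify, and it uses a small hypothetical in-memory lexicon (`LEXICON`) in place of a real lexical database such as WordNet. The function name `expand_query`, the `max_depth` limit, and the example labels are all illustrative assumptions, not the authors' implementation.

```python
from collections import deque

# Hypothetical toy lexicon standing in for a lexical database:
# word -> (synonyms, hypernyms). All entries are illustrative only.
LEXICON = {
    "puppy": (["pup"], ["dog"]),
    "pup": (["puppy"], ["dog"]),
    "dog": (["domestic dog"], ["canine"]),
    "canine": ([], ["carnivore"]),
    "carnivore": ([], ["mammal"]),
    "sedan": (["saloon"], ["car"]),
    "car": (["automobile"], ["motor vehicle"]),
}


def expand_query(words, database_labels, lexicon=LEXICON, max_depth=5):
    """Expand each query word breadth-first through synonyms and hypernyms
    until a word that is also a database label is reached.

    Returns a dict mapping each matched query word to the label found.
    Query words with no reachable label are omitted.
    """
    matches = {}
    for word in words:
        queue = deque([(word, 0)])
        seen = {word}
        while queue:
            current, depth = queue.popleft()
            if current in database_labels:
                matches[word] = current  # stop at the first label reached
                break
            if depth == max_depth:
                continue  # assumed cutoff to keep the expansion bounded
            synonyms, hypernyms = lexicon.get(current, ([], []))
            for candidate in synonyms + hypernyms:
                if candidate not in seen:
                    seen.add(candidate)
                    queue.append((candidate, depth + 1))
    return matches


# Labels that a classifier is assumed to have assigned to database objects.
labels = {"dog", "automobile", "mammal"}
result = expand_query(["puppy", "sedan", "unicorn"], labels)
print(result)
```

In this sketch, "puppy" resolves to the label "dog" through a hypernym and "sedan" resolves to "automobile" through a hypernym followed by a synonym, while "unicorn" has no entry in the toy lexicon and is left unmatched. Breadth-first order prefers labels that are fewer expansion steps away from the query word, which is one plausible reading of expanding "until" a database word is encountered.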
