SOM-Hunter: Video Browsing with Relevance-to-SOM Feedback Loop

This paper presents a prototype video retrieval engine built around a simple known-item search workflow: users initialize the search with a query and then iteratively explore a larger candidate set. Specifically, users view a sequence of displays and provide feedback to the system. Each display is generated dynamically by a self-organizing map that uses relevance scores derived from the collected feedback, so that the display matches the user's preferences. In addition, once promising candidates are found, users can inspect several other specialized displays for exploitation purposes.
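
The feedback loop described above can be illustrated with a minimal sketch. It is not the engine's actual implementation: the names `update_scores`, `train_som`, and `build_display`, the exponential score update, and all parameter values are assumptions made for illustration only. The sketch keeps one relevance score per frame, boosts the scores of frames similar to those the user marked, trains a small self-organizing map by sampling frames in proportion to their scores, and fills each grid cell of the display with the frame closest to that cell's node.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def update_scores(scores, features, liked, temperature=0.1):
    """Assumed feedback rule: boost items close to the liked ones, then renormalize."""
    new = []
    for s, f in zip(scores, features):
        boost = sum(math.exp(-sq_dist(f, features[j]) / temperature) for j in liked)
        new.append(s * (boost + 1e-9))  # small constant keeps every score positive
    total = sum(new)
    return [v / total for v in new]


def train_som(features, scores, rows, cols, iterations=2000, seed=0):
    """Train a rows x cols SOM, sampling items in proportion to their relevance score."""
    rng = random.Random(seed)
    dim = len(features[0])
    indices = range(len(features))
    # Initialize each node from a score-weighted random item.
    nodes = [list(features[i]) for i in rng.choices(indices, weights=scores, k=rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    for t in range(iterations):
        frac = t / iterations
        lr = 0.5 * (1.0 - frac)                   # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # shrinking neighborhood radius
        x = features[rng.choices(indices, weights=scores, k=1)[0]]
        # Best matching unit: the node closest to the sampled item.
        bmu = min(range(len(nodes)), key=lambda n: sq_dist(nodes[n], x))
        br, bc = divmod(bmu, cols)
        for n, w in enumerate(nodes):
            r, c = divmod(n, cols)
            h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
            for d in range(dim):
                w[d] += lr * h * (x[d] - w[d])
    return nodes


def build_display(features, nodes):
    """For each grid cell, return the index of the item closest to that cell's node."""
    return [min(range(len(features)), key=lambda i: sq_dist(features[i], node))
            for node in nodes]


if __name__ == "__main__":
    # Toy 2-D "frame features": two well-separated groups of four frames each.
    frames = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
              [1.0, 1.0], [0.9, 1.0], [1.0, 0.9], [0.9, 0.9]]
    scores = [1.0 / len(frames)] * len(frames)
    scores = update_scores(scores, frames, liked=[0])   # user marks frame 0
    nodes = train_som(frames, scores, rows=2, cols=2)
    print(build_display(frames, nodes))                 # frame indices, row-major grid
```

Because sampling follows the relevance scores, frames resembling the marked ones dominate the trained map, while the grid topology still places similar frames in neighboring cells. Repeating the update, train, and display steps after each round of feedback gives the iterative browsing loop.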
