NII-UIT Browser: A Multimodal Video Search System

We introduce an interactive system for searching a known scene in a video database. The key idea is to enable multimodal search. As the retrieved database is getting larger, using individual modals may not be powerful enough to discriminate a scene with other near duplicates. In our system, a known scene can be described and searched by its visual cues or audio genres. Templates are given for users to rapidly and exactly describe the scene. Moreover, search results are updated instantly as users change the description. As a result, users can generate a large number of possible queries to find the matched scene in a short time.

[1]  Vu Lam,et al.  NII-UIT: A Tool for Known Item Search by Sequential Pattern Filtering , 2014, MMM.

[2]  Deva Ramanan,et al.  Face detection, pose estimation, and landmark localization in the wild , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Hermann Ney,et al.  Features for image retrieval: an experimental comparison , 2008, Information Retrieval.

[5]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[6]  Jakub Lokoc,et al.  Signature-Based Video Browser , 2014, MMM.

[7]  Frank K. Soong,et al.  A segment model based approach to speech recognition , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.