Video similarity search by using compact representations

The number of applications that use unstructured data, such as videos, has increased, and research on multimedia retrieval has attracted great attention. Efficient indexing and retrieval of this kind of data is a major concern, because common keyword-based search approaches are not adequate for large video databases. Similarity search is a content-based approach that has been used successfully in retrieval systems. A major challenge is therefore to provide a video representation that is both accurate and compact, so that this type of search performs well and answers quickly. In this work, we propose a compact video representation that uses Min-Hash and the k-nearest GIST descriptors. We also present the first use of the BossaNova Video Descriptor (BNVD) for video similarity search. Both compact video representations achieve a mean average precision above 88% in video similarity search. The experimental results indicate that the proposed representations are highly efficient for the video retrieval task.
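To make the Min-Hash idea mentioned above more concrete, the following is a minimal sketch of how a set-based video representation can be compressed into a short signature and compared. It is an illustration under stated assumptions, not the method of this work: it assumes each video has already been reduced to a non-empty set of non-negative integer ids (for example, ids of the reference GIST descriptors nearest to its frames), and the function names, the number of hash functions, and the linear hash family are all hypothetical choices.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2**31 - 1; ids are assumed to be smaller than this


def minhash_signature(items, num_hashes=64, seed=0):
    """Return a min-hash signature for a non-empty set of non-negative integer ids.

    Each of the num_hashes linear hash functions h(x) = (a*x + b) mod PRIME
    contributes the minimum hash value it attains over the set.
    """
    rng = random.Random(seed)  # fixed seed so every video uses the same hash functions
    params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
              for _ in range(num_hashes)]
    return [min((a * x + b) % PRIME for x in items) for a, b in params]


def estimated_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature entries."""
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity, shown only for comparison with the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical id sets standing in for two videos that share half of their content.
video_a = set(range(0, 80))
video_b = set(range(40, 120))

sig_a = minhash_signature(video_a)
sig_b = minhash_signature(video_b)

print("exact:", exact_jaccard(video_a, video_b))
print("estimated:", estimated_jaccard(sig_a, sig_b))
```

The signature length is fixed by `num_hashes` regardless of how long the video is, which is what makes the representation compact. Longer signatures give a less noisy similarity estimate at the cost of more storage per video.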
