VERGE: A Multimodal Interactive Search Engine for Video Browsing and Retrieval

This paper presents the VERGE interactive search engine, which supports browsing and searching video content. The system integrates content-based analysis and retrieval modules, including video shot segmentation, concept detection, clustering, visual similarity search, and object-based search.
