Paper information - Studying the impact of sequence clustering on near-duplicate video retrieval: an experimental comparison

In this paper, we study the impact of clustering on near-duplicate video (NDV) retrieval. The aim is to reduce the search space at retrieval time by clustering the dataset off-line as a pre-processing step and then retrieving NDVs from the resulting clusters. Our contribution is a novel clustering framework inspired by a bioinformatics technique, namely DNA multiple sequence alignment (MSA). A series of video keyframes in chronological order is represented as an alphabetical genome, analogous to a DNA sequence, and MSA is employed to automatically partition the NDVs in a video collection into clusters. After discussing the advantages and shortcomings of the main state-of-the-art approaches to video clustering in the theoretical part of the paper, we empirically evaluate the performance of the proposed MSA-based framework against five clustering algorithms representative of these mainstream approaches: BIRCH, CURE, DBSCAN, Expectation-Maximization and PROCLUS. We also show that our clustering-based approach is significantly faster than non-clustering-based n-gram and edit-distance NDV retrieval techniques while achieving higher retrieval accuracy, as measured by mean average precision.
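The idea of treating keyframe sequences as strings and searching only within pre-computed clusters can be illustrated with a minimal sketch. This is not the paper's MSA-based framework: it swaps in a plain pairwise edit distance and a greedy threshold clustering, and it assumes each keyframe has already been mapped to a letter. The function names, the `threshold` value, and the sample sequences are all hypothetical.

```python
from __future__ import annotations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two keyframe-letter sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer sequence length."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def cluster_sequences(sequences: dict[str, str],
                      threshold: float = 0.3) -> list[list[str]]:
    """Off-line step (assumed, simplified): greedily assign each video to the
    first cluster whose representative (its first member) is close enough."""
    clusters: list[list[str]] = []
    for vid, seq in sequences.items():
        for cluster in clusters:
            if normalized_distance(seq, sequences[cluster[0]]) <= threshold:
                cluster.append(vid)
                break
        else:
            clusters.append([vid])
    return clusters


def retrieve(query: str, sequences: dict[str, str],
             clusters: list[list[str]]) -> list[str]:
    """On-line step: compare the query with one representative per cluster,
    then rank only the members of the closest cluster."""
    best = min(clusters,
               key=lambda c: normalized_distance(query, sequences[c[0]]))
    return sorted(best, key=lambda v: normalized_distance(query, sequences[v]))


# Hypothetical collection: each letter stands for one keyframe class.
videos = {"v1": "ABCDEF", "v2": "ABCDEG", "v3": "XYZXYZ", "v4": "XYZXYY"}
clusters = cluster_sequences(videos)
ranked = retrieve("ABCDEG", videos, clusters)
```

At query time the sketch compares the query against one representative per cluster rather than against every video, which is where the reduction in search space comes from.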

Yandan Wang | Mohammed Belkhatir
