Gradient Ordinal Signature and Fixed-Point Embedding for Efficient Near-Duplicate Video Detection

To meet the requirements of large-scale, real-time near-duplicate video detection, this paper makes two contributions. First, it proposes a compact local image descriptor termed the gradient ordinal signature (GOS). GOS is low-dimensional, simple to compute, and highly discriminative, and it is invariant to mirror reflection, rotation, and scale changes. Second, exploiting the characteristics of GOS together with the embedding theory of metric spaces, it proposes an efficient similarity search method based on fixed-point embedding (FE). A main advantage of FE is that its parameters are easy to control and its performance is stable and insensitive to changes in the dataset. Overall, our approach targets the speed rather than the accuracy of near-duplicate video detection. We evaluate the method in four tests. The first two, on image and video datasets respectively, evaluate GOS and demonstrate its effectiveness, efficiency, and lower memory usage. The third test compares FE with locality-sensitive hashing: FE is about ten times faster and uses more than 60% less memory. The fourth test shows that combining GOS and FE for near-duplicate video detection achieves better overall efficiency than state-of-the-art methods.
