A Gram-Based String Paradigm for Efficient Video Subsequence Search

The unprecedented growth in the generation and dissemination of video data has created an urgent demand for large-scale video content management systems that can quickly retrieve the videos users are interested in. Traditionally, video sequence data are managed by high-dimensional indexing structures, most of which suffer from the well-known “curse of dimensionality” and lack support for subsequence retrieval. Inspired by the high efficiency of string indexing methods, in this paper we present VideoGram, a string paradigm for large-scale video sequence indexing that achieves fast similarity search. In VideoGram, the feature space is modeled as a set of visual words, and each database video sequence is mapped into a string over those words. A gram-based indexing structure is then built on the strings to mitigate the effect of the “curse of dimensionality” and to support video subsequence matching. Given a high-dimensional query video sequence, retrieval is performed by transforming the query into a string and then searching the index structure for matching strings. By doing so, expensive high-dimensional similarity computations can be completely avoided. An efficient sequence search algorithm with upper-bound pruning is also presented. We conduct an extensive performance study on real-life video collections to validate our proposal.
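
The pipeline described above (visual words, strings, gram index, string-based lookup) can be illustrated with a minimal sketch. This is not the VideoGram implementation: the codebook, the fixed gram length `n`, the function names, and the offset-voting score are all assumptions made for illustration, and the upper-bound pruning mentioned in the abstract is not reproduced here.

```python
from collections import defaultdict

def nearest_word(frame, codebook):
    """Index of the codebook vector closest to `frame` (squared Euclidean)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(frame, codebook[i])))

def to_string(frames, codebook):
    """Map a sequence of frame features to a tuple of visual-word ids."""
    return tuple(nearest_word(f, codebook) for f in frames)

def grams(s, n):
    """All contiguous length-n grams of the string s, in order."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def build_index(videos, codebook, n):
    """Inverted index: gram -> list of (video id, position in that video)."""
    index = defaultdict(list)
    for vid, frames in videos.items():
        for pos, g in enumerate(grams(to_string(frames, codebook), n)):
            index[g].append((vid, pos))
    return index

def search(query_frames, index, codebook, n):
    """Count shared grams per (video id, alignment offset), best first.

    Only the query frames are quantized; database videos are touched
    solely through gram lookups (assumed scoring, not the paper's).
    """
    votes = defaultdict(int)
    for qpos, g in enumerate(grams(to_string(query_frames, codebook), n)):
        for vid, pos in index.get(g, ()):
            votes[(vid, pos - qpos)] += 1
    return sorted(votes.items(), key=lambda kv: -kv[1])

# Toy data (hypothetical): 2-D features and a 4-word codebook.
codebook = [(0, 0), (1, 0), (0, 1), (1, 1)]
videos = {
    "a": [(0, 0), (1, 0), (0, 1), (1, 1), (0, 0), (1, 0)],
    "b": [(1, 1), (1, 1), (0, 1), (0, 1), (1, 0), (1, 0)],
}
index = build_index(videos, codebook, n=2)
query = [(0.1, 0.9), (0.9, 1.1), (0.1, -0.1)]  # noisy copy of a[2:5]
hits = search(query, index, codebook, n=2)
print(hits)
```

In this toy run the query is located as a subsequence of video "a" starting at position 2, using only dictionary lookups on grams once the query has been quantized.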
