Effective and Efficient Indexing for Large Video Databases

Content-based multimedia retrieval is an important topic in database systems. An emerging and challenging problem in this area is content-based search in video data. A video clip can be considered as a sequence of images, or frames. Since this representation is too complex to support efficient video retrieval, a video clip is often summarized by a more concise feature representation. In this paper, we transform a video clip into a set of probabilistic feature vectors (pfvs). In our case, a pfv corresponds to a Gaussian in the feature space of frames. We demonstrate that this representation is well suited for accurate video retrieval. The use of pfvs allows us to calculate confidence values that a frame, or a set of frames, is contained in a given video in the database. These confidence values can be employed to specify two types of queries. The first type, a probabilistic threshold query, retrieves the videos stored in the database that contain a given set of frames with a probability larger than a given threshold value. The second type, a probabilistic ranking query, retrieves the k database videos that contain the given query set with the highest probabilities. To process these queries efficiently, we introduce query algorithms on set-valued objects. Our solution is based on the Gauss-tree, an index structure for efficiently managing Gaussians in arbitrary vector spaces. Our experimental evaluation demonstrates that sets of probabilistic feature vectors yield a compact and descriptive representation of video clips. Additionally, we show that our new query algorithms outperform competing approaches when answering the given types of queries on a database of over 900 real-world video clips.
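To make the two query types concrete, the following is a minimal Python sketch that evaluates them by a linear scan over a toy database. It is not the paper's algorithm: it does not use the Gauss-tree, and several choices are assumptions made only for illustration. Each pfv is modeled as an axis-aligned Gaussian given by a (mean, sigma) pair, a query frame is scored by its best-matching pfv in a video, query frames are treated as independent, and all videos share a uniform prior. All function names and the sample data are hypothetical.

```python
import math


def log_gauss(x, mean, sigma):
    """Log density of x under an axis-aligned Gaussian (diagonal covariance)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * s * s) - (xi - m) ** 2 / (2.0 * s * s)
        for xi, m, s in zip(x, mean, sigma)
    )


def log_likelihood(query_frames, video):
    """log p(Q | video). Assumption: each frame is scored by its best-matching
    pfv, and frames are treated as independent (log densities are summed)."""
    return sum(
        max(log_gauss(q, mean, sigma) for mean, sigma in video)
        for q in query_frames
    )


def posteriors(query_frames, database):
    """P(video | Q) via Bayes' rule, assuming a uniform prior over videos."""
    logs = {name: log_likelihood(query_frames, v) for name, v in database.items()}
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(value - top) for name, value in logs.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def threshold_query(query_frames, database, threshold):
    """Names of videos whose posterior exceeds the threshold."""
    post = posteriors(query_frames, database)
    return sorted(name for name, p in post.items() if p > threshold)


def ranking_query(query_frames, database, k):
    """Names of the k videos with the highest posterior, best first."""
    post = posteriors(query_frames, database)
    return sorted(post, key=post.get, reverse=True)[:k]


# Hypothetical toy data: each video is a list of (mean, sigma) pfvs in 2-D.
database = {
    "clip_a": [((0.0, 0.0), (1.0, 1.0)), ((5.0, 5.0), (1.0, 1.0))],
    "clip_b": [((10.0, 10.0), (1.0, 1.0))],
    "clip_c": [((0.5, 0.5), (2.0, 2.0))],
}
query = [(0.1, -0.2), (4.8, 5.1)]

above_half = threshold_query(query, database, 0.5)
top_two = ranking_query(query, database, 2)
```

In this toy setting, both query frames lie close to the two pfvs of `clip_a`, so it receives almost all of the probability mass. A linear scan like this touches every pfv in the database; the index-based algorithms described in the abstract are meant to avoid exactly that cost.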
