Handling temporal heterogeneous data for content-based management of large video collections

Video document retrieval is now an active area of multimedia retrieval. Unlike other media, however, managing a collection of video documents adds the problem of efficiently handling an overwhelming volume of temporal data. Challenges include balancing efficient content modeling and storage against fast access at various levels. In this paper, we detail the framework we have built to accommodate our developments in content-based multimedia retrieval. We show that our framework not only facilitates the development of processing and indexing algorithms but also opens the way to several other possibilities, such as rapid interface prototyping and retrieval algorithm benchmarking. We also discuss our developments in relation to wider contexts such as MPEG-7 and the TREC Video Track.
