Integrated semantic-syntactic video modeling for search and browsing

Video processing and computer vision communities usually employ shot-based or object-based structural video models and associate low-level descriptions (color, texture, shape, and motion) and semantic descriptions (textual annotations) with these structural (syntactic) elements. Database and information retrieval communities, on the other hand, employ entity-relation or object-oriented models to model the semantics of multimedia documents. This paper proposes a new generic integrated semantic-syntactic video model that includes all of these elements within a single framework, enabling structured video search and browsing that combine textual and low-level descriptors. The proposed model includes semantic entities (video objects and events) and the relations between them. We introduce a new "actor" entity to enable grouping of object roles in specific events. This context-dependent classification of the attributes of an object allows for more efficient browsing and retrieval. The model also allows for decomposition of events into elementary motion units and elementary reaction/interaction units in order to access mid-level semantics and low-level video features. The instantiations of the model are expressed as graphs. Users can formulate flexible queries that can be translated into such graphs. Alternatively, users can input query graphs by editing an abstract model (model template). Search and retrieval are accomplished by matching the query graph against the instantiated models in the database. Examples and experimental results are provided to demonstrate the effectiveness of the proposed integrated modeling and querying framework.
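To make the graph-based querying idea concrete, the following is a minimal sketch of how an instantiated model and a query might be represented as labeled graphs and matched. It is an illustration only, not the matching procedure used in the paper: the `ModelGraph` class, the node kinds (`event`, `actor`, `object`), the relation names (`has_actor`, `role_of`), and the soccer example data are all assumptions introduced here, and the brute-force search is only practical for very small graphs.

```python
from dataclasses import dataclass, field
from itertools import permutations


@dataclass
class ModelGraph:
    """Labeled directed graph (hypothetical structure, not the paper's)."""
    nodes: dict = field(default_factory=dict)  # node id -> (kind, name)
    edges: set = field(default_factory=set)    # (src id, relation, dst id)

    def add_node(self, node_id, kind, name=None):
        # name=None in a query graph acts as a wildcard
        self.nodes[node_id] = (kind, name)

    def add_edge(self, src, relation, dst):
        self.edges.add((src, relation, dst))


def node_matches(query_node, data_node):
    """A query node matches if kinds agree and the name is equal or unspecified."""
    q_kind, q_name = query_node
    d_kind, d_name = data_node
    return q_kind == d_kind and (q_name is None or q_name == d_name)


def find_matches(query, data):
    """Yield every injective mapping of query nodes onto data nodes that
    preserves node labels and all query edges (brute-force enumeration)."""
    q_ids = list(query.nodes)
    for candidate in permutations(data.nodes, len(q_ids)):
        mapping = dict(zip(q_ids, candidate))
        if not all(node_matches(query.nodes[q], data.nodes[d])
                   for q, d in mapping.items()):
            continue
        if all((mapping[s], r, mapping[t]) in data.edges
               for s, r, t in query.edges):
            yield mapping


# Assumed instantiated model: one "goal" event with two actors (roles),
# each actor bound to a video object.
video = ModelGraph()
video.add_node("e1", "event", "goal")
video.add_node("a1", "actor", "scorer")
video.add_node("a2", "actor", "goalkeeper")
video.add_node("o1", "object", "player_7")
video.add_node("o2", "object", "player_1")
video.add_edge("e1", "has_actor", "a1")
video.add_edge("e1", "has_actor", "a2")
video.add_edge("a1", "role_of", "o1")
video.add_edge("a2", "role_of", "o2")

# Query graph: which object plays the "scorer" role in a "goal" event?
query = ModelGraph()
query.add_node("qe", "event", "goal")
query.add_node("qa", "actor", "scorer")
query.add_node("qo", "object")  # wildcard object
query.add_edge("qe", "has_actor", "qa")
query.add_edge("qa", "role_of", "qo")

matches = list(find_matches(query, video))
```

In this sketch the actor node carries the context-dependent role, so the same video object could be attached to different actor nodes in different events. A practical system would replace the exhaustive enumeration with an indexed or approximate matching strategy and would also compare low-level descriptors attached to the nodes.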
