A Language for Audiovisual Template Specification and Recognition

We address the problem of automatically detecting occurrences of high-level patterns in audiovisual documents. These patterns correspond to recurring sequences of shots, which documentalists treat as first-class entities and use for annotation and retrieval. We introduce a language for specifying these patterns, based on an extension of Allen's algebra with the regular expression operator +, which denotes an iteration of arbitrary length. We formulate this pattern language within the constraint satisfaction framework, representing templates as constraint problems. We propose an efficient representation of domains (all subsequences of a given graph) and filtering methods for the Allen constraints. We illustrate the resulting system on a corpus of real-world news broadcast examples.
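To make the idea of a template concrete, the following is a minimal sketch in Python, not the system described above. It assumes shots are labeled time intervals, uses Allen's "meets" relation between consecutive shots, and reads + in its usual regular-expression sense of one or more repetitions. The names `Shot`, `match_plus`, and `find_report_sequences`, the labels "anchor" and "report", and the sample data are all hypothetical. A plain linear scan stands in for the constraint-based domain representation and filtering, which the sketch does not attempt to reproduce.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Shot:
    """A shot as a labeled time interval (hypothetical representation)."""
    start: int
    end: int
    label: str


def meets(a: Shot, b: Shot) -> bool:
    """Allen's 'meets' relation: a ends exactly where b starts."""
    return a.end == b.start


def match_plus(shots: List[Shot], i: int, label: str) -> List[int]:
    """Return every exclusive end index j such that shots[i:j] is a run of
    one or more shots with the given label, each meeting the next."""
    ends = []
    j = i
    while (j < len(shots) and shots[j].label == label
           and (j == i or meets(shots[j - 1], shots[j]))):
        j += 1
        ends.append(j)
    return ends


def find_report_sequences(shots: List[Shot]) -> List[Tuple[int, int]]:
    """Find index pairs (first, last) matching the illustrative template
    anchor meets (report)+ meets anchor."""
    results = []
    for i, shot in enumerate(shots):
        if shot.label != "anchor":
            continue
        for j in match_plus(shots, i + 1, "report"):
            opens = meets(shots[i], shots[i + 1])
            closes = (j < len(shots) and shots[j].label == "anchor"
                      and meets(shots[j - 1], shots[j]))
            if opens and closes:
                results.append((i, j))
    return results


# Made-up sample segmentation of a news broadcast.
SHOTS = [
    Shot(0, 10, "anchor"),
    Shot(10, 25, "report"),
    Shot(25, 40, "report"),
    Shot(40, 50, "anchor"),
    Shot(50, 60, "weather"),
]

print(find_report_sequences(SHOTS))
```

On the sample data the only match spans shots 0 through 3: an anchor shot, two report shots, and a closing anchor shot. A template with several interacting Allen relations would not reduce to a single left-to-right scan like this one.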
