Detecting events from continuous media by intermodal collaboration and knowledge use

We propose an event network, a structured representation oriented toward the contents of continuous media, and present two methods for detecting events as a first step toward constructing the network. We deal with sports TV programs, taking American football as a case study. The first method is simple intermodal collaboration: linking the visual and linguistic (closed-caption) streams. The second method uses domain knowledge about the state transitions of football games to extract specific visual objects that carry content information. The experimental results indicate that both methods are effective for event detection.
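The intermodal-collaboration idea can be illustrated as aligning event keywords in the closed-caption stream with the video shots whose time spans contain them. The sketch below is a minimal illustration under assumed data structures (`Shot`, `Caption`) and a hypothetical keyword list; it is not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Shot:
    start: float  # shot start time in seconds
    end: float    # shot end time in seconds

@dataclass
class Caption:
    time: float   # timestamp of the closed-caption line in seconds
    text: str

# Hypothetical domain keywords that signal football events.
EVENT_KEYWORDS = {"touchdown", "fumble", "interception", "field goal"}

def detect_events(shots, captions):
    """Link the linguistic stream to the visual stream: return
    (shot, keyword) pairs where an event keyword appears in a caption
    whose timestamp falls within the shot's time span."""
    events = []
    for cap in captions:
        text = cap.text.lower()
        for kw in EVENT_KEYWORDS:
            if kw in text:
                for shot in shots:
                    if shot.start <= cap.time < shot.end:
                        events.append((shot, kw))
    return events
```

A caption such as "What a touchdown!" at t = 12.5 s would be linked to the shot spanning 10-20 s, marking that shot as a candidate touchdown event.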
