Collaborative multimedia analysis for detecting semantical events from broadcasted sports video

In this paper we present an approach to detecting semantic events in broadcast sports video through collaborative multimedia analysis, which we call intermodal collaboration. Broadcast video can be viewed as a set of multimodal streams: visual, auditory, and textual (closed caption, CC) streams. By exploiting the temporal dependency between these streams, we aim to improve the reliability and efficiency of event detection. The method consists of three stages: CC stream analysis, auditory stream analysis, and visual stream analysis. We learn frequently appearing event-related keywords from the CC stream and feature parameters characterizing cheering and shouting from the auditory stream. Experimental results on broadcast video of American football games indicate that the approach is effective for event detection.
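
The first two stages can be pictured with a minimal sketch. It is an illustration only, not the paper's implementation: the names `Caption`, `cc_candidates`, `audio_confirms`, and `detect_events`, the backward-looking `window`, the fixed energy `threshold`, and the assumption that captions trail the action are all hypothetical, and the visual stage is omitted.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    time: float  # caption timestamp in seconds (hypothetical format)
    text: str


def cc_candidates(captions, keywords, window=10.0):
    """CC stage (sketch): return a time span before each caption that
    contains an event keyword. The backward-looking window assumes the
    caption appears after the action it describes."""
    spans = []
    for c in captions:
        words = [w.strip(".,!?").lower() for w in c.text.split()]
        if any(k in words for k in keywords):
            spans.append((max(0.0, c.time - window), c.time))
    return spans


def audio_confirms(energy, span, threshold):
    """Auditory stage (sketch): accept a span if any audio sample inside
    it reaches the threshold, standing in for cheering or shouting.
    `energy` is a list of (time, value) pairs."""
    start, end = span
    return any(start <= t <= end and e >= threshold for t, e in energy)


def detect_events(captions, energy, keywords, threshold, window=10.0):
    """Keep only the CC candidate spans that the audio stream confirms."""
    return [
        span
        for span in cc_candidates(captions, keywords, window)
        if audio_confirms(energy, span, threshold)
    ]
```

In this sketch the cheap text stage narrows the search to a few candidate spans, so the audio check only runs where a keyword was found; a later visual stage would, under the same assumptions, only need to examine the spans that survive both checks.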
