Classification and representation of semantic content in broadcast tennis videos

This paper investigates the semantic analysis of broadcast tennis footage. We treat the spatio-temporal behaviour of an object in the footage as the embodiment of a semantic event. This object is tracked using a colour-based particle filter. Video syntax and audio features are used to help delineate the temporal boundaries of these events. For broadcast tennis footage, the system first parses the video sequence based on the geometry of the content in view and classifies each clip as a particular view type. The temporal behaviour of the serving player is then modelled using hidden Markov models (HMMs), each of which is representative of a particular semantic episode. Events are then summarised using a number of synthesised keyframes.
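As an illustration of the HMM-based classification step only, the following minimal sketch scores a quantised player trajectory against one discrete HMM per episode and picks the most likely model. Everything specific here is an assumption made for the example and is not taken from the paper: the episode labels (`serve_and_volley`, `baseline_rally`), the three court-zone symbols, the two-state topologies, and all probability values. The tracker, view classification and audio features are not modelled.

```python
import math

# Hypothetical observation symbols: tracked player position quantised
# into court zones (0 = baseline, 1 = mid-court, 2 = net).

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    if not obs:
        return 0.0
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p

# Illustrative models (start, transition, emission); values are made up.
MODELS = {
    "serve_and_volley": (
        [1.0, 0.0],
        [[0.6, 0.4], [0.0, 1.0]],            # left-to-right: back, then forward
        [[0.7, 0.25, 0.05], [0.05, 0.25, 0.7]],
    ),
    "baseline_rally": (
        [1.0, 0.0],
        [[0.9, 0.1], [0.5, 0.5]],            # mostly stays near the baseline
        [[0.85, 0.1, 0.05], [0.3, 0.6, 0.1]],
    ),
}

def classify(obs, models=MODELS):
    """Return the label of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

approach = [0, 0, 1, 2, 2, 2]   # player moves from the baseline to the net
stay_back = [0, 0, 0, 0, 1, 0]  # player remains close to the baseline
labels = [classify(approach), classify(stay_back)]
```

In practice the model parameters would be estimated from labelled training clips rather than set by hand, and continuous features would call for Gaussian rather than discrete emissions.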
