Structuring Soccer Video Based on Audio Classification and Segmentation Using Hidden Markov Model

This paper presents a scheme for indexing and segmenting video by analyzing its audio track with Hidden Markov Models (HMMs), and applies that analysis to structuring soccer video. Based on the attributes of soccer broadcasts, we define three audio classes: Game-audio, Advertisement-audio and Studio-audio. For each audio class, an HMM is built using a clip-based stream of 26-coefficient feature vectors as the observation sequence. Test data are then classified by maximum likelihood: each clip is assigned to the class whose trained model gives it the highest likelihood. Because the audio type is highly unlikely to change abruptly within a broadcast, we also apply smoothing rules in the final segmentation of an audio sequence. Experimental results indicate that the framework produces satisfactory classification and segmentation results.
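The classification and smoothing steps described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the paper: the HMMs are discrete with hand-picked toy parameters (the paper trains its models on 26-coefficient feature vectors), each clip is a short sequence of integer symbols, and the single smoothing rule shown (relabel an isolated clip whose two neighbours agree) is one plausible rule, since the abstract does not list the rules actually used. The helper names `toy_model`, `classify` and `smooth` are hypothetical.

```python
import math

CLASSES = ("Game-audio", "Advertisement-audio", "Studio-audio")

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(obs, log_pi, log_A, log_B):
    """log P(obs | model) for a discrete HMM, via the forward algorithm."""
    n = len(log_pi)
    alpha = [log_pi[i] + log_B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + log_A[i][j] for i in range(n)]) + log_B[j][o]
            for j in range(n)
        ]
    return log_sum_exp(alpha)

def toy_model(favored, n_symbols=3):
    """Hypothetical 2-state HMM whose states both favor one symbol."""
    pi = [0.5, 0.5]
    A = [[0.9, 0.1], [0.1, 0.9]]
    B = []
    for strength in (0.8, 0.6):
        rest = (1.0 - strength) / (n_symbols - 1)
        B.append([strength if k == favored else rest for k in range(n_symbols)])
    to_log = lambda rows: [[math.log(p) for p in row] for row in rows]
    return [math.log(p) for p in pi], to_log(A), to_log(B)

def classify(obs, models):
    """Maximum-likelihood decision: the class whose HMM scores obs highest."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

def smooth(labels):
    """Assumed rule: an isolated clip between two agreeing neighbours is relabeled."""
    out = list(labels)
    for i in range(1, len(out) - 1):
        if out[i - 1] == out[i + 1] != out[i]:
            out[i] = out[i - 1]
    return out

# One toy model per audio class; symbol k is favored by class k.
models = {name: toy_model(k) for k, name in enumerate(CLASSES)}

# Each clip is a short sequence of discrete observation symbols (toy data).
clips = [[0, 0, 0, 0], [0, 0, 2, 0], [1, 1, 1, 0],
         [0, 0, 0, 1], [2, 2, 2, 2], [2, 2, 1, 2]]

raw = [classify(clip, models) for clip in clips]
smoothed = smooth(raw)
print(raw)
print(smoothed)
```

In this toy sequence the third clip is classified differently from both of its neighbours, so the smoothing pass folds it back into the surrounding segment. A real system would replace `toy_model` with models trained on labeled audio and would likely use continuous emission densities.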
