Event Detection Models Using 2d-BN and CRFs

In this paper, we propose two novel semantic event detection models, i.e., Two-dependence Bayesian Network (2d-BN) and Conditional Random Fields (CRFs). 2d-BN is a simplified Bayesian Network classifier which can characterize the feature relationships well and be trained more efficiently than traditional complex Bayesian Networks. CRFs are undirected probabilistic graphical models which offer several particular advantages including the abilities to relax strong independence assumptions in the state transition and avoid a fundamental limitation of directed probability graphical models. Based on multi-modality fusion and mid-level keywords representation, we use a three-level framework to detect semantic events. The first level extracts audiovisual features, the mid-level detects semantic keywords, and the high-level infers events using 2d-BN and CRFs models. Compared with state of the art, extensive experimental results demonstrate the effectiveness of the proposed two models.

[1]  M. Luo,et al.  Pyramidwise structuring for soccer highlight extraction , 2003, Fourth International Conference on Information, Communications and Signal Processing, 2003 and the Fourth Pacific Rim Conference on Multimedia. Proceedings of the 2003 Joint.

[2]  Tao Wang,et al.  A Novel Sports Video Logo Detector Based on Motion Analysis , 2006, ICONIP.

[3]  A. Murat Tekalp,et al.  Automatic soccer video analysis and summarization , 2003, IEEE Trans. Image Process..

[4]  Shih-Fu Chang,et al.  Structure analysis of soccer video with hidden Markov models , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[5]  Chng Eng Siong,et al.  Automatic replay generation for soccer video broadcasting , 2004, MULTIMEDIA '04.

[6]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[7]  Bojan Cestnik,et al.  Estimating Probabilities: A Crucial Task in Machine Learning , 1990, ECAI.

[8]  Qi Tian,et al.  A mid-level representation framework for semantic sports video analysis , 2003, ACM Multimedia.

[9]  Mohan S. Kankanhalli,et al.  Creating audio keywords for event detection in soccer video , 2003, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698).

[10]  Nir Friedman,et al.  Bayesian Network Classifiers , 1997, Machine Learning.

[11]  Fernando Pereira,et al.  Shallow Parsing with Conditional Random Fields , 2003, NAACL.

[12]  Minh Le Nguyen,et al.  FlexCRFs: Flexible Conditional Random Fields , 2005 .

[13]  Xiaokun Li,et al.  A hidden Markov model framework for traffic event detection using video features , 2004, 2004 International Conference on Image Processing, 2004. ICIP '04..