A hybrid framework for event detection using multi-modal features

The paper presents a novel approach to event detection in sports videos based on learning a topic-based graphical model. The characteristic features that define various sports events are extracted by contextually grouping low-level video and audio features using topic modeling. Events are then detected by learning the structure of the context-based distribution of these characteristic features with a CRF-based graphical model. The approach is evaluated experimentally on recorded videos of handball and soccer games.
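
The detection stage can be pictured with a small sketch. It is not the authors' implementation: it assumes a linear-chain CRF (the abstract does not specify the graph structure), assumes each video segment has already been reduced to a vector of topic proportions by the topic model, and uses hand-set weights where a real system would learn them from labeled data. The label names, `EMISSION`, `TRANSITION`, and `viterbi` are all hypothetical names chosen for illustration. The code shows only how the most likely label sequence is decoded once such weights exist.

```python
from typing import List, Sequence

# Hypothetical event labels for one video segment.
LABELS = ["play", "goal"]

# Hypothetical emission weights: EMISSION[label][k] scores topic k under a label.
# In a trained CRF these would be learned, not set by hand.
EMISSION = {
    "play": [1.0, -1.0],
    "goal": [-1.0, 2.0],
}

# Hypothetical transition weights between the labels of consecutive segments.
TRANSITION = {
    ("play", "play"): 0.5, ("play", "goal"): -0.5,
    ("goal", "play"): -0.5, ("goal", "goal"): 0.5,
}


def emission_score(label: str, topics: Sequence[float]) -> float:
    """Linear score of a label given one segment's topic proportions."""
    return sum(w * t for w, t in zip(EMISSION[label], topics))


def viterbi(sequence: Sequence[Sequence[float]]) -> List[str]:
    """Return the highest-scoring label sequence under the linear-chain model."""
    if not sequence:
        return []
    # scores[t][label] = best score of any labeling of segments 0..t ending in label.
    scores = [{lab: emission_score(lab, sequence[0]) for lab in LABELS}]
    back = []  # back[t-1][label] = best previous label for `label` at step t
    for topics in sequence[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for lab in LABELS:
            best_prev = max(LABELS, key=lambda p: prev[p] + TRANSITION[(p, lab)])
            cur[lab] = (prev[best_prev] + TRANSITION[(best_prev, lab)]
                        + emission_score(lab, topics))
            ptr[lab] = best_prev
        scores.append(cur)
        back.append(ptr)
    # Backtrack from the best final label.
    path = [max(LABELS, key=lambda lab: scores[-1][lab])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Toy input: topic proportions for five consecutive segments (made-up values).
segments = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]
decoded = viterbi(segments)
print(decoded)
```

With these toy weights, segments dominated by the second topic are labeled `goal` and the rest `play`. The transition weights favor staying in the same label, which smooths isolated noisy segments.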
