Automatic video genre categorization and event detection techniques on large-scale sports data

This paper presents an efficient and robust automatic process for large-scale sports video analysis. The proposed system first identifies the genre of the query video and then detects events of interest within it. The framework's significance lies in its fully automatic operation at test time, the minimal human involvement required in training, and its scalability and extensibility on large-scale datasets. Domain-knowledge-independent local features are extracted from an input video sequence, and a histogram-based distribution representation is built using the bag-of-visual-words (BoW) model. For genre categorization, k-nearest neighbor (k-NN) classifiers with various dissimilarity measures are assessed and evaluated analytically. For event detection, a hidden conditional random field (HCRF) structured prediction model is used. Overall, the framework processes voluminous sports video collections efficiently and accurately across several video analysis tasks. It also demonstrates the potential for transferring the technology from the "laboratory bench" to commercial applications.
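The genre categorization step described above (local descriptors quantized into a BoW histogram, then classified by k-NN under a chosen dissimilarity measure) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy two-word codebook, the genre labels, and the choice of the chi-square distance as the dissimilarity measure are all assumptions made for illustration. In practice the codebook would be learned by clustering descriptors from training videos.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def squared_euclidean(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bow_histogram(descriptors: List[Vector], codebook: List[Vector]) -> List[float]:
    """Assign each local descriptor to its nearest codeword and
    return the L1-normalized histogram of codeword counts."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda i: squared_euclidean(d, codebook[i]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


def chi_square(p: Vector, q: Vector) -> float:
    """Chi-square dissimilarity between two histograms (one assumed choice
    among the 'various dissimilarity measures'); empty bins are skipped."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def knn_predict(query: Vector,
                train: List[Tuple[Vector, str]],
                k: int = 3,
                dissimilarity: Callable[[Vector, Vector], float] = chi_square) -> str:
    """Return the majority label among the k training histograms
    that are least dissimilar to the query histogram."""
    ranked = sorted(train, key=lambda item: dissimilarity(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): a 2-word codebook and four labeled training histograms.
codebook = [(0.0, 0.0), (1.0, 1.0)]
train = [([0.9, 0.1], "soccer"), ([0.8, 0.2], "soccer"),
         ([0.1, 0.9], "tennis"), ([0.2, 0.8], "tennis")]
query_descriptors = [(0.1, 0.0), (0.0, 0.2), (0.9, 1.0), (0.1, 0.1)]

query_hist = bow_histogram(query_descriptors, codebook)
predicted_genre = knn_predict(query_hist, train, k=3)
```

Because the dissimilarity measure is passed as a function, other measures can be swapped in through the `dissimilarity` parameter without changing the classifier.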
