Audio-Visual Classification of Sports Types

In this work we propose a method for classifying sports types from combined audio and visual features extracted from thermal video. From the audio, Mel Frequency Cepstral Coefficients (MFCC) are extracted, and PCA is applied to reduce the feature space to 10 dimensions. From the visual modality, short trajectories are constructed to represent the motion of players. From these trajectories, four motion features are extracted and combined directly with the audio features. A k-nearest neighbour classifier is applied to 180 one-minute video sequences from three sports types. Using 10-fold cross validation, a correct classification rate of 96.11% is obtained with multimodal features, compared to 86.67% and 90.00% using only visual or audio features, respectively.
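The fusion and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic Gaussian stand-ins (no MFCC extraction, PCA, or trajectory construction is performed), the value k=3 and the Euclidean distance are assumptions, and the function names `fuse`, `knn_predict`, and `cross_validate` are hypothetical. Only the shapes follow the abstract: 180 sequences, three classes, 10 audio dimensions, 4 motion features, and 10-fold cross validation.

```python
import math
import random
from collections import Counter


def fuse(audio, visual):
    """Feature-level fusion: concatenate each sequence's audio and visual vectors."""
    return [a + v for a, v in zip(audio, visual)]


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training vectors closest to x (Euclidean distance)."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, folds=10, k=3, seed=0):
    """Return the fraction of samples classified correctly over all folds."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    correct = 0
    for f in range(folds):
        test = set(order[f::folds])  # every folds-th shuffled index forms one fold
        train = [i for i in order if i not in test]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test)
    return correct / len(X)


# Synthetic stand-in data (assumption): 3 classes x 60 sequences = 180 samples.
rng = random.Random(0)
labels = [c for c in range(3) for _ in range(60)]
audio = [[rng.gauss(c, 1.0) for _ in range(10)] for c in labels]   # 10-D, as after PCA
visual = [[rng.gauss(c, 1.0) for _ in range(4)] for c in labels]   # 4 motion features
fused = fuse(audio, visual)                                        # 14-D combined vector

for name, feats in [("audio", audio), ("visual", visual), ("fused", fused)]:
    print(name, round(cross_validate(feats, labels), 4))
```

The printed accuracies describe the synthetic data only and say nothing about the reported results. As a consistency check on the abstract's figures, if each of the 180 sequences is tested exactly once, 96.11%, 86.67%, and 90.00% correspond to 173, 156, and 162 correctly classified sequences.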
