Real-time field sports scene classification using colour and frequency space decompositions

This paper presents a novel approach to recognizing the scene depicted in an image, with specific application to scene classification in field sports video. We propose several variants of the algorithm, ranging from a bag-of-visual-words formulation to a simplified real-time implementation that takes only the most important areas of similar colour into account. All variants achieve similar accuracy, comparable to that of well-known image indexing techniques such as SIFT or HoG. For comparison purposes, we also developed a dedicated database, which is now available online. The algorithm is well suited to the scene recognition task thanks to its speed and its robustness to changes in image resolution, which makes it a good candidate for real-time video indexing systems. The procedure is also simple, as it is based on the well-known Fourier transform.
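The general idea of combining a colour decomposition with a frequency-space description can be illustrated with a minimal sketch. It is not the algorithm of the paper: the function names `colour_mask` and `frequency_descriptor`, the RGB thresholds, and the choice of keeping the `k x k` lowest-frequency magnitudes are all assumptions made for illustration. The sketch selects one area of similar colour as a binary mask and describes its shape by a few size-normalised Fourier magnitudes, which change only slightly when the image resolution changes.

```python
import numpy as np


def colour_mask(image, lo, hi):
    """Binary mask of pixels whose RGB values lie within [lo, hi] per channel.

    The thresholds are hypothetical; a real system would have to choose or
    learn the dominant colour areas.
    """
    lo = np.asarray(lo)
    hi = np.asarray(hi)
    return np.all((image >= lo) & (image <= hi), axis=-1).astype(float)


def frequency_descriptor(mask, k=4):
    """Magnitudes of the k x k lowest frequencies of the mask.

    Dividing by the pixel count makes the values comparable across image
    sizes; element 0 is the fraction of the image covered by the mask.
    """
    spectrum = np.abs(np.fft.fft2(mask)) / mask.size
    return spectrum[:k, :k].ravel()


# Synthetic "pitch" image: the lower half is green, the upper half is black.
image = np.zeros((32, 32, 3), dtype=np.uint8)
image[16:, :, 1] = 200

# The same scene at twice the resolution (nearest-neighbour upsampling).
image_2x = np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)

green_lo, green_hi = (0, 150, 0), (100, 255, 100)
d_small = frequency_descriptor(colour_mask(image, green_lo, green_hi))
d_large = frequency_descriptor(colour_mask(image_2x, green_lo, green_hi))

# Largest difference between the two descriptors; small for low frequencies.
print(np.max(np.abs(d_small - d_large)))
```

Such a fixed-length vector could then be passed to any standard classifier; which classifier and which frequencies to keep are design choices this sketch leaves open.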
