Light Cascaded Convolutional Neural Networks for Accurate Player Detection

Vision-based player detection is important in sports applications. Accuracy, efficiency, and low memory consumption are desirable for real-time tasks such as intelligent broadcasting and automatic event classification. In this paper, we present a cascaded convolutional neural network (CNN) that satisfies all three requirements. Our method first trains a binary (player/non-player) classification network on labeled image patches. At test time, it applies the network efficiently to the whole image. We conducted experiments on basketball and soccer games. The results demonstrate that our method accurately detects players under challenging conditions such as varying illumination, highly dynamic camera movement, and motion blur. Compared with conventional CNNs, our approach achieves state-of-the-art accuracy on both games with 1000x fewer parameters (i.e., it is light).
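
The two-step idea in the abstract (train on patches, then apply to a whole image) can be illustrated with a minimal sketch. It is not the authors' network: a single linear filter with a sigmoid stands in for a trained patch classifier, and all function names, weights, and thresholds here are hypothetical. The sketch shows that scoring every patch one by one gives the same result as one dense pass over the image, and that a cascade only needs to evaluate later stages at locations that earlier stages kept.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sliding_window_scores(image, w, b):
    """Slow baseline: score each patch independently with the stand-in classifier."""
    ph, pw = w.shape
    rows = image.shape[0] - ph + 1
    cols = image.shape[1] - pw + 1
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            patch = image[i:i + ph, j:j + pw]
            out[i, j] = sigmoid(np.sum(patch * w) + b)
    return out


def dense_scores(image, w, b):
    """Same scores, computed for all patch locations in one vectorized pass."""
    windows = sliding_window_view(image, w.shape)  # shape (rows, cols, ph, pw)
    logits = np.einsum("ijkl,kl->ij", windows, w) + b
    return sigmoid(logits)


def cascade_detect(image, stages, thresholds):
    """Cascade: each stage is evaluated only where all earlier stages passed.

    stages is a list of (w, b) pairs that share one patch size (an assumption
    of this sketch); thresholds holds one score threshold per stage.
    """
    windows = sliding_window_view(image, stages[0][0].shape)
    alive = np.ones(windows.shape[:2], dtype=bool)
    for (w, b), t in zip(stages, thresholds):
        idx = np.nonzero(alive)  # surviving patch locations
        logits = np.einsum("nkl,kl->n", windows[idx], w) + b
        alive[idx] = sigmoid(logits) >= t
    return alive


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.normal(size=(12, 10))  # toy grayscale "frame"
    w1, w2 = rng.normal(size=(4, 3)), rng.normal(size=(4, 3))
    same = np.allclose(sliding_window_scores(image, w1, 0.1),
                       dense_scores(image, w1, 0.1))
    mask = cascade_detect(image, [(w1, 0.1), (w2, -0.2)], [0.5, 0.5])
    print("dense pass matches per-patch scoring:", same)
    print("locations kept by the cascade:", int(mask.sum()))
```

In a real detector each stage would be a small CNN rather than a single filter, but the structure is the same: a classifier trained on fixed-size patches is evaluated densely over the frame, and cheap early stages discard most locations before later stages run.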
