Ball Detection Using YOLO and Mask R-CNN

Many computer vision applications rely on accurate and fast object detection. In our case, ball detection is a prerequisite for action recognition in handball scenes. We compare two state-of-the-art convolutional neural network-based object detectors, YOLO and Mask R-CNN, on the task of ball detection in non-staged, real-world conditions. The comparison uses speed and accuracy measures on a dataset comprising custom handball footage and a sample of images obtained from the Internet. Each model is evaluated both with and without additional training on examples from our dataset.
