End-to-End Camera Calibration for Broadcast Videos

The increasing number of vision-based tracking systems deployed in production has necessitated fast, robust camera calibration. In the sports domain, most current work focuses on sports where lines and intersections are easy to extract and appearance is relatively consistent across venues. For more challenging sports such as basketball, however, those techniques are not sufficient. In this paper, we propose an end-to-end approach for calibrating a single moving camera across challenging scenarios in sports. Our method contains three key modules: 1) area-based court segmentation, 2) camera pose estimation with embedded templates, and 3) homography prediction via a spatial transformer network (STN). All three modules are connected, enabling end-to-end training. We evaluate our method on a new college basketball dataset and demonstrate state-of-the-art performance in variable and dynamic environments. We also validate our method on the World Cup 2014 dataset, where it performs competitively against state-of-the-art methods. Lastly, we show that our method is two orders of magnitude faster than the previous state of the art on both datasets.
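
As a rough illustration of what a predicted homography is used for, the sketch below maps court-template points into image coordinates with a 3×3 matrix. The function name `apply_homography` and the example matrix are hypothetical and are not taken from the paper; the segmentation, pose estimation, and STN stages that would produce the matrix are not modelled here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]


def apply_homography(H: Matrix3, points: Sequence[Point]) -> List[Point]:
    """Map 2D points through a 3x3 homography using homogeneous coordinates."""
    mapped = []
    for x, y in points:
        # Multiply H by the homogeneous point (x, y, 1).
        u = H[0][0] * x + H[0][1] * y + H[0][2]
        v = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        if abs(w) < 1e-12:
            raise ValueError("point maps to infinity under H")
        # Divide by w to return to inhomogeneous image coordinates.
        mapped.append((u / w, v / w))
    return mapped


# Hypothetical homography (assumption for illustration only):
# scale by 2, then translate by (10, 5).
H_example = [
    [2.0, 0.0, 10.0],
    [0.0, 2.0, 5.0],
    [0.0, 0.0, 1.0],
]
court_corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
image_points = apply_homography(H_example, court_corners)
```

Here `court_corners` stands in for points on a court template; a real calibration would use the matrix predicted for each frame rather than a fixed example.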
