Vertical Axis Detection for Sport Video Analytics

Video processing pipelines normally assume that cameras are oriented so that people appear upright, which simplifies subsequent steps such as person detection. In practice, cameras are often mounted at arbitrary orientations to maximize coverage of the viewing space, so the apparent vertical axis of the captured video may not correspond to the true vertical direction of the scene. To address this, we propose a classification-based system that normalizes the video to compensate for the camera orientation, and we demonstrate its performance on outdoor sports video. The system works as follows. From an arbitrary set of sports videos, we first automatically create a training/testing image dataset in which players appear at various orientations. Our classifier, a stacked autoencoder connected to a softmax output layer, is trained on this dataset to estimate the orientation of players. The orientation of an input video is then normalized according to the orientations of its player patches, whose angles are estimated by the trained classifier. Experiments conducted on a hockey field video dataset show that the proposed system estimates the true vertical axis of an input video accurately.
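The final normalization step, turning per-patch orientation estimates into a single correction for the whole video, can be illustrated with a short sketch. The abstract does not state how the patch estimates are combined, so the circular mean used here is an assumption, as are the function names `circular_mean_deg` and `correction_angle_deg` and the convention that angles are measured in degrees from the upright direction.

```python
import math


def circular_mean_deg(angles_deg):
    """Average angles on the circle, so that 350 and 10 average to about 0, not 180.

    Assumption: each input is a player-patch orientation in degrees,
    as estimated by some upstream classifier.
    """
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    if math.hypot(s, c) < 1e-9:
        # Empty input, or estimates that cancel out exactly.
        raise ValueError("patch orientations give no consistent direction")
    return math.degrees(math.atan2(s, c)) % 360.0


def correction_angle_deg(patch_angles_deg):
    """Rotation (in degrees) that would bring the mean player orientation upright."""
    return (-circular_mean_deg(patch_angles_deg)) % 360.0


if __name__ == "__main__":
    # Hypothetical per-patch estimates from one video.
    estimates = [80.0, 90.0, 100.0]
    print(correction_angle_deg(estimates))
```

A circular mean is used instead of an arithmetic mean because orientations wrap around at 360 degrees. A majority vote over the classifier's discrete angle classes would be an equally plausible choice.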
