A Graph-Based Method for Soccer Action Spotting Using Unsupervised Player Classification

Action spotting in soccer videos is the task of identifying the specific time at which a key action of the game occurs. It has recently received considerable attention, and powerful methods have been introduced. Action spotting involves understanding the dynamics of the game, the complexity of events, and the variation of video sequences. Most approaches have focused on the last of these, given that their models exploit the global visual features of the sequences. In this work, we focus on the first, the dynamics of the game, by (a) identifying and representing the players, referees, and goalkeepers as nodes in a graph, and by (b) modeling their temporal interactions as sequences of graphs. For the player identification, or player classification, task, we obtain an accuracy of 97.72% on our annotated benchmark. For the action spotting task, our method obtains an overall performance of 57.83% average-mAP when combined with other audiovisual modalities. This performance surpasses that of similar graph-based methods and is competitive with computationally heavy methods. Code and data are available at https://github.com/IPCV/soccer_action_spotting.
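
As an illustration of the representation described in (a) and (b), the following is a minimal sketch, not the released implementation. It assumes each detected person is given as a 2D position plus a role label, and it links nodes by a distance threshold; the names `Node`, `build_frame_graph`, `build_graph_sequence`, the role strings, and the `radius` parameter are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Node:
    """One detected person in a frame (hypothetical structure)."""
    x: float
    y: float
    role: str  # e.g. "player_team_a", "player_team_b", "goalkeeper", "referee"


def build_frame_graph(nodes, radius=10.0):
    """Return undirected edges (i, j) with i < j for nodes within `radius`.

    The distance-threshold rule is an assumption for illustration only.
    """
    edges = []
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            dist = hypot(nodes[i].x - nodes[j].x, nodes[i].y - nodes[j].y)
            if dist <= radius:
                edges.append((i, j))
    return edges


def build_graph_sequence(frames, radius=10.0):
    """Turn a list of frames into a sequence of (nodes, edges) graphs."""
    return [(nodes, build_frame_graph(nodes, radius)) for nodes in frames]


# Two consecutive frames with made-up positions.
frames = [
    [Node(10.0, 20.0, "player_team_a"),
     Node(14.0, 23.0, "player_team_b"),
     Node(50.0, 30.0, "referee")],
    [Node(11.0, 20.0, "player_team_a"),
     Node(13.0, 22.0, "player_team_b"),
     Node(30.0, 25.0, "referee")],
]
sequence = build_graph_sequence(frames)
```

A temporal model would then consume `sequence` frame by frame; how edges and node features are actually defined in the method is not stated in this abstract.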
