Automatic annotation of tennis games: An integration of audio, vision, and learning

Fully automatic annotation of tennis games from broadcast video is a task with great potential but enormous challenges. In this paper we describe our approach to this task, which integrates computer vision, machine listening, and machine learning. At the low level, we improve upon our previously proposed state-of-the-art tennis ball tracking algorithm and employ audio signal processing techniques to detect key events and construct features for classifying them. At the high level, we model event classification as a sequence labelling problem and investigate four machine learning techniques using simulated event sequences. Finally, we evaluate our proposed approach on three real-world tennis games and discuss the interplay between audio, vision, and learning. To the best of our knowledge, our system is the only one that can annotate tennis games at such a detailed level.

Highlights: fully automatic annotation of real-world tennis video; state-of-the-art tennis ball tracking algorithm; an integration of computer vision, machine listening, and machine learning; the only system that can annotate tennis games at such a detailed level.
