Automatic Detection and Analysis of Player Action in Moving Background Sports Video Sequences

This paper presents a system for automatically detecting and analyzing complex player actions in sports video sequences with moving backgrounds. The system targets two uses: action-based indexing of sports video, and kinematic measurements that support coaching and performance improvement. It works in a coarse-to-fine fashion. At the coarse granularity level, it automatically segments the highlights of an input video, that is, the clips containing the desired action, as summaries for general viewing. At the middle granularity level, it recognizes the action types to support action-based video indexing and retrieval. At the fine granularity level, it extracts the critical kinematic parameters of the player's action for use in professional training. The complex, dynamic backgrounds of sports videos and the complexity of the player actions themselves make automatic analysis considerably difficult. To address this, the paper proposes robust algorithms for global motion estimation with adaptive outlier filtering, object segmentation based on adaptive background construction, and automatic human body tracking. Two visual analysis tools, motion panorama and overlay composition, are also introduced. The system and algorithms are tested on real diving and jump game videos, and the extensive experimental results are encouraging and indicate that they are effective.
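The abstract names global motion estimation with adaptive outlier filtering but does not spell out the procedure. The following Python sketch illustrates the general idea only and is not the authors' algorithm. It assumes block motion vectors are already available, assumes a 2-D affine camera-motion model, and uses a median-plus-MAD residual threshold as one possible "adaptive" filter. The function name `estimate_global_motion` and the parameters `n_iter` and `k` are hypothetical.

```python
import numpy as np

def estimate_global_motion(points, vectors, n_iter=10, k=2.5):
    """Fit an affine global-motion model to block motion vectors.

    points  : (N, 2) block centres (x, y)
    vectors : (N, 2) measured displacement of each block
    Returns (params, inliers), where params is a (3, 2) matrix such that
    [x, y, 1] @ params approximates the camera-induced displacement, and
    inliers is a boolean mask of blocks treated as background.
    NOTE: illustrative assumption, not the method described in the paper.
    """
    points = np.asarray(points, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    # Design matrix [x, y, 1], shared by the x- and y-displacement fits.
    A = np.column_stack([points, np.ones(len(points))])
    inliers = np.ones(len(points), dtype=bool)
    params = np.zeros((3, 2))
    for _ in range(n_iter):
        # Least-squares fit using only the current inlier blocks.
        params, *_ = np.linalg.lstsq(A[inliers], vectors[inliers], rcond=None)
        # Residual of every block (inlier or not) against the fitted model.
        residuals = np.linalg.norm(A @ params - vectors, axis=1)
        # Threshold adapts to the spread of the current inlier residuals.
        med = np.median(residuals[inliers])
        mad = np.median(np.abs(residuals[inliers] - med))
        threshold = med + k * 1.4826 * mad + 1e-6
        new_inliers = residuals <= threshold
        # Stop if too few blocks remain or the inlier set has converged.
        if new_inliers.sum() < 3 or np.array_equal(new_inliers, inliers):
            break
        inliers = new_inliers
    return params, inliers
```

In this sketch, blocks rejected as outliers are the ones whose motion disagrees with the dominant camera motion, which is where a moving player would typically appear. A real system would also need the motion vectors themselves (for example, from block matching) and would likely tune or replace the thresholding rule.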
