Athlete pose estimation by non-sequential key-frame propagation

This paper considers the problem of estimating human pose in challenging monocular sports videos, where manual intervention is often required to obtain useful results. Fully automatic approaches focus on developing inference algorithms and probabilistic prior models based on learned measurements, and often struggle to generalise beyond the dataset they were learned from. This work extends an interactive, model-based generative technique for accurately estimating human pose from uncalibrated, unconstrained monocular TV sports footage. To obtain reliable tracking from limited operator input, it introduces keyframe propagation together with assistance that helps the operator select optimal keyframes. Experimental results show that the approach produces results competitive with those obtained using twice as many manually annotated keyframes, halving the amount of interaction required.
