Summarised hierarchical Markov models for speed-invariant action matching

Action matching, where a recorded sequence is matched against, and synchronised with, a suitable proxy from a library of animations, is a technique for generating a synthetic representation of a recorded human activity. This proxy can then be used to represent the action in a virtual environment or as a prior for further processing of the sequence. In this paper we present a novel technique for performing action matching in outdoor sports environments. Outdoor sports broadcasts are typically multi-camera environments, so reconstruction techniques can be applied to the footage to generate a 3D model of the scene. However, due to poor calibration and matting, this reconstruction is of very low quality. Our technique matches the 3D reconstruction sequence against a predefined library of actions to select an appropriate high-quality synthetic representation. A hierarchical Markov model combined with 3D summarisation of the data allows a large number of different actions to be matched successfully to the sequence in a rate-invariant manner, without prior segmentation of the sequence into discrete units. The technique is applied to data captured at rugby and soccer games.
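To make the idea of rate-invariant matching concrete, the following is a minimal sketch and not the method of the paper. It assumes each library action has been summarised into a short list of key poses, reduces every pose to a single scalar feature for brevity, and uses a flat left-to-right Markov chain rather than a hierarchical one. The function names `match_score` and `best_action`, the transition probabilities and the toy library are all hypothetical. Self-loops absorb slow performances and skip transitions absorb fast ones, so the same template can match sequences of different lengths. Unlike the technique described above, the sketch assumes the observed sequence has already been cut to a single action.

```python
import math


def match_score(observed, template, stay=0.4, step=0.4, skip=0.2, sigma=1.0):
    """Best-path (Viterbi) log-score of `observed` against `template`.

    `template` is a list of key-pose features (assumed scalar here).
    Allowed moves per frame: stay on a key pose, step to the next one,
    or skip one key pose. All parameters are illustrative assumptions.
    """
    log_trans = {0: math.log(stay), 1: math.log(step), 2: math.log(skip)}

    def log_emit(x, mean):
        # Unnormalised Gaussian log-likelihood of one scalar feature.
        return -((x - mean) ** 2) / (2.0 * sigma ** 2)

    n = len(template)
    neg_inf = float("-inf")

    # The path is required to start on the first key pose.
    scores = [neg_inf] * n
    scores[0] = log_emit(observed[0], template[0])

    for x in observed[1:]:
        new = [neg_inf] * n
        for j in range(n):
            best = neg_inf
            for jump, log_p in log_trans.items():
                i = j - jump
                if i >= 0 and scores[i] > neg_inf:
                    best = max(best, scores[i] + log_p)
            if best > neg_inf:
                new[j] = best + log_emit(x, template[j])
        scores = new

    # The path is required to end on the last key pose.
    return scores[-1]


def best_action(observed, library):
    """Return the name of the library action with the highest score."""
    return max(library, key=lambda name: match_score(observed, library[name]))


# Toy library: two hypothetical actions, each summarised by four key poses.
library = {
    "rise": [0.0, 1.0, 2.0, 3.0],
    "fall": [3.0, 2.0, 1.0, 0.0],
}

slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0]  # half speed
fast = [0.0, 2.0, 3.0]                           # one key pose skipped

print(best_action(slow, library))
print(best_action(fast, library))
```

In this toy setting both the slow and the fast sequence are assigned to the same action, because the score depends on which key poses are visited in order rather than on how many frames each one lasts.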
