Paper Information - From Canonical Poses to 3D Motion Capture Using a Single Camera

We combine detection and tracking techniques to achieve robust 3D motion recovery of people seen from arbitrary viewpoints by a single, potentially moving camera. Our approach detects key postures, which can be done reliably, then uses a motion model to infer 3D poses between consecutive detections, and finally refines these poses over the whole sequence using a generative model. We demonstrate the approach on golf motions filmed with a static camera and on walking motions acquired with a potentially moving one. We show that our approach, although monocular, is both metrically accurate, because it integrates information over many frames, and robust, because it can recover from a few misdetections.
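As a rough illustration of the middle step (inferring poses between consecutive key-posture detections), the sketch below fills in every frame by linear interpolation between the nearest detections. This is only a stand-in: the abstract does not specify the motion model, and the function name `interpolate_poses`, the pose-vector layout, and the sample values are all assumptions made for this example.

```python
from bisect import bisect_right

def interpolate_poses(key_frames, key_poses, num_frames):
    """Return one pose vector per frame, linearly interpolated between
    the key-pose detections that bracket it (hypothetical helper).

    key_frames: sorted frame indices at which a key posture was detected
    key_poses:  one pose vector (list of floats) per detection
    """
    poses = []
    for t in range(num_frames):
        # Index of the last detection at or before frame t.
        i = bisect_right(key_frames, t) - 1
        if i < 0:
            # Before the first detection: hold the first key pose.
            poses.append(list(key_poses[0]))
        elif i >= len(key_frames) - 1:
            # At or after the last detection: hold the last key pose.
            poses.append(list(key_poses[-1]))
        else:
            t0, t1 = key_frames[i], key_frames[i + 1]
            w = (t - t0) / (t1 - t0)  # 0 at t0, approaching 1 near t1
            poses.append([(1 - w) * a + w * b
                          for a, b in zip(key_poses[i], key_poses[i + 1])])
    return poses

# Made-up example: three detections, two pose parameters each.
key_frames = [0, 4, 8]
key_poses = [[0.0, 10.0], [1.0, 20.0], [0.0, 10.0]]
trajectory = interpolate_poses(key_frames, key_poses, 9)
```

In a full system the interpolated poses would serve only as an initialization; the refinement over the whole sequence described above is not shown here.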
