Paper information - The Vitruvian manifold: Inferring dense correspondences for one-shot human pose estimation

Fitting an articulated model to image data is often approached as an optimization over both model pose and model-to-image correspondence. For complex models such as humans, previous work has required a good initialization, or an alternating minimization between correspondence and pose. In this paper we investigate one-shot pose estimation: can we directly infer correspondences using a regression function trained to be invariant to body size and shape, and then optimize the model pose just once? We evaluate on several challenging single-frame data sets containing a wide variety of body poses, shapes, torso rotations, and image cropping. Our experiments demonstrate that one-shot pose estimation achieves state-of-the-art results and runs in real time.
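The idea of optimizing pose "just once" after the correspondences are fixed can be illustrated with a much simpler problem than the one the paper addresses. The sketch below assumes a rigid 2D model rather than an articulated 3D body, and assumes the correspondences are already given as paired point lists; the function name `fit_rigid_2d` is hypothetical. Under those assumptions the least-squares rotation and translation have a closed form, so no alternation between correspondence and pose is needed. It is not the paper's method, only a toy instance of the same principle.

```python
import math


def fit_rigid_2d(model_pts, image_pts):
    """Least-squares rotation and translation mapping model_pts onto image_pts.

    Both arguments are equal-length lists of (x, y) pairs, where image_pts[i]
    is assumed to be the known correspondence of model_pts[i].
    Returns (theta, (tx, ty)) with theta in radians.
    """
    n = len(model_pts)
    if n == 0 or n != len(image_pts):
        raise ValueError("need the same non-zero number of points in both sets")

    # Centroids of both point sets.
    mcx = sum(p[0] for p in model_pts) / n
    mcy = sum(p[1] for p in model_pts) / n
    icx = sum(p[0] for p in image_pts) / n
    icy = sum(p[1] for p in image_pts) / n

    # Accumulate dot and cross products of the centered pairs.
    dot = 0.0
    cross = 0.0
    for (mx, my), (ix, iy) in zip(model_pts, image_pts):
        ax, ay = mx - mcx, my - mcy
        bx, by = ix - icx, iy - icy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx

    # The angle maximizing cos(theta) * dot + sin(theta) * cross.
    theta = math.atan2(cross, dot)

    # Translation that maps the rotated model centroid onto the image centroid.
    c, s = math.cos(theta), math.sin(theta)
    tx = icx - (c * mcx - s * mcy)
    ty = icy - (s * mcx + c * mcy)
    return theta, (tx, ty)
```

In 3D the same closed-form step is usually solved with a singular value decomposition, as in the Kabsch reference [1]; an articulated model would additionally need one transform per body part and a kinematic constraint between them.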

[1] W. Kabsch. A solution for the best rotation to relate two sets of vectors, 1976.

[2] Michael J. Black, et al. On the unification of line processes, 1996.

[3] Yali Amit, et al. Shape Quantization and Recognition with Randomized Trees, 1997, Neural Computation.

[4] Dorin Comaniciu, et al. Mean Shift: A Robust Approach Toward Feature Space Analysis, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[5] Leo Breiman, et al. Random Forests: Finding Quasars, 2003.

[6] Leo Breiman, et al. Random Forests, 2001, Machine Learning.

[7] Ian D. Reid, et al. Articulated Body Motion Capture by Stochastic Search, 2005, International Journal of Computer Vision.

[8] Ankur Agarwal, et al. 3D human pose from silhouettes by relevance vector regression, 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.

[9] Michael J. Black, et al. On the unification of line processes, outlier rejection, and robust statistics with applications in early vision, 1996, International Journal of Computer Vision.

[10] Jitendra Malik, et al. Twist Based Acquisition and Tracking of Animal and Human Kinematics, 2004, International Journal of Computer Vision.

[11] Vincent Lepetit, et al. Real-time nonrigid surface detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[12] Vincent Lepetit, et al. Randomized trees for real-time keypoint recognition, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[13] Hans-Peter Seidel, et al. Optimization and Filtering for Human Motion Capture, 2010, International Journal of Computer Vision.

[14] Luca Ballan, et al. Marker-less motion capture of skinned models in a four camera set-up using optical flow and silhouettes, 2008.

[15] Juergen Gall, et al. Optimization and Filtering for Human Motion Capture A Multi-Layer Framework, 2008.

[16] Trevor Darrell, et al. Sparse probabilistic regression for activity-independent human pose inference, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[17] Emiliano Gambaretto, et al. Markerless Motion Capture through Visual Hull, Articulated ICP and Subject Specific Model Generation, 2010, International Journal of Computer Vision.

[18] Emmanuel Prados, et al. Gradient Flows for Optimizing Triangular Mesh-based Surfaces: Applications to 3D Reconstruction Problems Dealing with Visibility, 2011, International Journal of Computer Vision.

[19] Raquel Urtasun, et al. Combining discriminative and generative methods for 3D deformable surface and articulated pose reconstruction, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[20] Sebastian Thrun, et al. Real time motion capture using a single time-of-flight camera, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[21] N. Crato. The Vitruvian Man, 2010.

[22] Bodo Rosenhahn, et al. Efficient and Robust Shape Matching for Model Based Human Motion Capture, 2011, DAGM-Symposium.

[23] Ruigang Yang, et al. Accurate 3D pose estimation from a single depth image, 2011, 2011 International Conference on Computer Vision.

[24] Andrew W. Fitzgibbon, et al. Efficient regression of general-activity human poses from depth images, 2011, 2011 International Conference on Computer Vision.

[25] Toby Sharp, et al. Real-time human pose recognition in parts from single depth images, 2011, CVPR.

[26] Hans-Peter Seidel, et al. A data-driven approach for real-time full body pose reconstruction from a depth camera, 2011, 2011 International Conference on Computer Vision.

[27] Antonio Criminisi, et al. Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning, 2012, Found. Trends Comput. Graph. Vis.

[28] Chio-Tan Kuo, et al. Real Time Non-rigid Surface Detection Based on Binary Robust Independent Elementary Features, 2014, 2014 International Symposium on Computer, Consumer and Control.