Paper Information - Predicting People's 3D Poses from Short Sequences

We propose an efficient approach to exploiting motion information from consecutive frames of a video sequence to recover the 3D pose of people. Instead of computing candidate poses in individual frames and then linking them, as is often done, we regress directly from a spatio-temporal block of frames to the 3D pose in the central frame. We demonstrate that this approach allows us to effectively overcome ambiguities and to improve upon the state of the art on challenging sequences.
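The core idea, regressing from a block of consecutive frames to the pose in the central frame, can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the helper names `spatiotemporal_blocks` and `fit_ridge`, the use of raw flattened pixels as features, ridge regression as the regressor, and the synthetic data.

```python
import numpy as np

def spatiotemporal_blocks(frames, half_window):
    """Flatten the 2*half_window+1 frames around each valid central frame.

    frames: array of shape (T, H, W).
    Returns (blocks, centers); blocks has one row per central frame.
    """
    num_frames = frames.shape[0]
    centers = np.arange(half_window, num_frames - half_window)
    blocks = np.stack([
        frames[t - half_window : t + half_window + 1].ravel() for t in centers
    ])
    return blocks, centers

def fit_ridge(X, Y, lam=1e-3):
    """Closed-form ridge regression: solve (X^T X + lam*I) M = X^T Y for M."""
    dim = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(dim), X.T @ Y)

# Synthetic stand-in data (hypothetical): 200 tiny 3x3 frames, 3 joints in 3D.
rng = np.random.default_rng(0)
num_frames, height, width, num_joints, half_window = 200, 3, 3, 3, 1
frames = rng.normal(size=(num_frames, height, width))

# One feature row per central frame, built from its temporal neighbourhood.
X, centers = spatiotemporal_blocks(frames, half_window)

# Synthetic linear ground truth so the fit can be checked exactly.
true_map = rng.normal(size=(X.shape[1], 3 * num_joints))
Y = X @ true_map

learned_map = fit_ridge(X, Y)
pred = X @ learned_map  # predicted 3D pose (flattened) for each central frame
```

Each row of `X` covers a temporal window rather than a single image, so the regressor sees motion cues directly and no separate step is needed to link per-frame pose candidates. Frames closer than `half_window` to either end of the sequence have no full window and get no prediction in this sketch.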
