Complex volume and pose tracking with probabilistic dynamical models and visual hull constraints

We propose a method for estimating the pose of a human body from its approximate 3D volume (visual hull), obtained in real time from synchronized videos. The method copes with tight-fitting clothing as well as loose-fitting clothing, which hides the body and produces non-rigid motions and critical reconstruction errors. To follow shape variations robustly despite erratic motions and the ambiguity between a reconstructed body shape and its pose, a probabilistic dynamical model of human volumes is learned from training volume sequences refined by error correction. A dynamical model of the body pose (joint angles) is also learned together with its corresponding volume. Pose is estimated by comparing the volume model with an input visual hull and regressing the pose from the pose model. Our method improves this estimate with a double volume comparison: 1) comparison in a low-dimensional latent space with probabilistic volume models and 2) comparison in the observation volume space using geometric constraints between a real volume and a visual hull. Comparative experiments demonstrate that our method is effective and faster than existing methods.
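
The double volume comparison can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that volumes are sets of occupied voxel coordinates, that the latent-space comparison can be approximated by Euclidean distance (the abstract describes probabilistic volume models instead), and that the geometric constraint is the standard property that a visual hull contains the real volume. The names `hull_scores` and `select_candidate`, the dictionary keys, and the product score are all hypothetical.

```python
from math import dist


def hull_scores(candidate, hull):
    """Compare a candidate body volume with an observed visual hull.

    Both arguments are sets of (x, y, z) occupied voxels (an assumed
    representation). A visual hull contains the real volume, so a good
    candidate should lie inside the hull while explaining most of it.
    """
    if not candidate or not hull:
        raise ValueError("both voxel sets must be non-empty")
    inside = len(candidate & hull)
    containment = inside / len(candidate)  # fraction of candidate inside the hull
    coverage = inside / len(hull)          # fraction of the hull that is explained
    return containment, coverage


def select_candidate(query_latent, candidates, hull, k=3):
    """Two-stage selection (hypothetical stand-in for the double comparison).

    Each candidate is a dict with keys "latent", "voxels", and "pose".
    """
    # Stage 1: shortlist the k candidates nearest to the query in latent space.
    shortlist = sorted(candidates, key=lambda c: dist(c["latent"], query_latent))[:k]

    # Stage 2: re-rank the shortlist in the observation volume space.
    def score(c):
        containment, coverage = hull_scores(c["voxels"], hull)
        return containment * coverage  # assumed scoring rule

    return max(shortlist, key=score)
```

In this sketch the returned candidate's "pose" entry would serve as the pose estimate. The second stage can overrule a candidate that is close in latent space but geometrically inconsistent with the observed hull.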
