Leveraging depth cameras and wearable pressure sensors for full-body kinematics and dynamics capture

We present a new method for full-body motion capture that uses input data captured by three depth cameras and a pair of pressure-sensing shoes. Our system is appealing because it is low-cost, non-intrusive, and fully automatic, and it can accurately reconstruct both full-body kinematics and dynamics data. We first introduce a novel tracking process that automatically reconstructs 3D skeletal poses from input data captured by three Kinect cameras and wearable pressure sensors. We formulate the problem in an optimization framework and incrementally update 3D skeletal poses with observed depth data and pressure data via iterative linear solvers. The system is highly accurate because we integrate depth data from multiple depth cameras, foot pressure data, detailed full-body geometry, and environmental contact constraints into a unified framework. In addition, we develop an efficient physics-based motion reconstruction algorithm that solves for internal joint torques and contact forces in a quadratic programming framework. During reconstruction, we leverage Newtonian physics, friction cone constraints, contact pressure information, and the 3D kinematic poses obtained from the kinematic tracking process to reconstruct full-body dynamics data. We demonstrate the power of our approach by capturing a wide range of human movements, achieving state-of-the-art accuracy in comparisons against alternative systems.
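Read as equations, the two stages described above admit a compact summary; the notation below is ours (it is not spelled out in the abstract), and the sketch is only meant to be consistent with the description, not the authors' exact formulation. The kinematic tracking stage incrementally updates the skeletal pose q_t by minimizing a weighted sum of a depth-registration term, a foot-pressure/contact term, and a temporal smoothness term, each linearized so that every update reduces to an iterative linear solve. The dynamics stage can then be read as a per-frame quadratic program over the joint torques \tau and contact forces f:

  minimize_{\tau, f}   \| M(q)\ddot{q} + h(q, \dot{q}) - J_c(q)^T f - S^T \tau \|^2 + w_p \| P f - p_{meas} \|^2
  subject to           f restricted to a (linearized) friction cone at each active contact,
                       f = 0 at links not in contact.

Here M(q) is the mass matrix, h(q, \dot{q}) collects gravity, Coriolis, and centrifugal terms, J_c is the contact Jacobian, S selects the actuated joints (the root is unactuated), P maps contact forces to the pressure-sensor readings p_{meas}, and w_p controls how strongly the solution must agree with the measured foot pressure.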
