Sparse Inertial Poser: Automatic 3D Human Pose Estimation from Sparse IMUs

We address the problem of making human motion capture in the wild more practical by using a small set of inertial sensors attached to the body. Since the problem is heavily under‐constrained, previous methods either use a large number of sensors, which is intrusive, or they require additional video input. We take a different approach and constrain the problem by: (i) making use of a realistic statistical body model that includes anthropometric constraints and (ii) using a joint optimization framework to fit the model to orientation and acceleration measurements over multiple frames. The resulting tracker Sparse Inertial Poser (SIP) enables motion capture using only 6 sensors (attached to the wrists, lower legs, back and head) and works for arbitrary human motions. Experiments on the recently released TNT15 dataset show that, using the same number of sensors, SIP achieves higher accuracy than the dataset baseline without using any video data. We further demonstrate the effectiveness of SIP on newly recorded challenging motions in outdoor scenarios such as climbing or jumping over a wall.

[1]  Nassir Navab,et al.  Discriminative Human Full-Body Pose Estimation from Wearable Inertial Sensor Data , 2009, 3DPH.

[2]  Hongdong Li,et al.  Rotation Averaging , 2013, International Journal of Computer Vision.

[3]  Michael J. Black,et al.  Pose-conditioned joint angle limits for 3D human pose reconstruction , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Andrea Tagliasacchi,et al.  Robust Articulated-ICP for Real-Time Hand Tracking , 2015 .

[5]  Jessica K. Hodgins,et al.  Action capture with accelerometers , 2008, SCA '08.

[6]  Hans-Peter Seidel,et al.  Outdoor human motion capture using inverse kinematics and von mises-fisher sampling , 2011, 2011 International Conference on Computer Vision.

[7]  Michael J. Black,et al.  HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion , 2010, International Journal of Computer Vision.

[8]  Michael J. Black,et al.  Dyna: a model of dynamic human shape in motion , 2015, ACM Trans. Graph..

[9]  Taku Komura,et al.  Real-time Physics-based Motion Capture with Sparse Sensors , 2016, CVMP 2016.

[10]  Hans-Peter Seidel,et al.  Motion capture using joint skeleton tracking and surface estimation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Matthew Q. Hill,et al.  Body talk , 2016, ACM Trans. Graph..

[12]  Bodo Rosenhahn,et al.  Human Pose Estimation from Video and IMUs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Taehyun Rhee,et al.  Realtime human motion control with a small number of inertial sensors , 2011, SI3D.

[14]  Cristian Sminchisescu,et al.  Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Julius Hannink,et al.  Sensor-Based Gait Parameter Extraction With Deep Convolutional Neural Networks , 2016, IEEE Journal of Biomedical and Health Informatics.

[16]  Yaser Sheikh,et al.  Motion capture from body-mounted cameras , 2011, ACM Trans. Graph..

[17]  Wojciech Matusik,et al.  Practical motion capture in everyday surroundings , 2007, ACM Trans. Graph..

[18]  Peter V. Gehler,et al.  Keep It SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image , 2016, ECCV.

[19]  Hans-Peter Seidel,et al.  Motion reconstruction using sparse accelerometer data , 2011, TOGS.

[20]  Andrew W. Fitzgibbon,et al.  Efficient and precise interactive hand tracking through joint, continuous optimization of pose and correspondences , 2016, ACM Trans. Graph..

[21]  Hans-Peter Seidel,et al.  EgoCap , 2016, ACM Trans. Graph..

[22]  Michael J. Black,et al.  SMPL: A Skinned Multi-Person Linear Model , 2023 .

[23]  Wojciech Matusik,et al.  Articulated mesh animation from multi-view silhouettes , 2008, ACM Trans. Graph..

[24]  Richard M. Murray,et al.  A Mathematical Introduction to Robotic Manipulation , 1994 .

[25]  Jessica K. Hodgins,et al.  Performance animation from low-dimensional control signals , 2005, SIGGRAPH 2005.

[26]  Bodo Rosenhahn,et al.  Ball joints for Marker-less human Motion Capture , 2009, 2009 Workshop on Applications of Computer Vision (WACV).

[27]  Bodo Rosenhahn,et al.  Model-Based Pose Estimation , 2011, Visual Analysis of Humans.

[28]  Hans-Peter Seidel,et al.  Real-Time Body Tracking with One Depth Camera and Inertial Sensors , 2013, 2013 IEEE International Conference on Computer Vision.

[29]  Bodo Rosenhahn,et al.  Analyzing and Evaluating Markerless Motion Tracking Using Inertial Sensors , 2010, ECCV Workshops.