Toward Marker-Free 3D Pose Estimation in Lifting: A Deep Multi-View Solution

Lifting is a common manual material handling task performed in the workplace. It is considered one of the main risk factors for work-related musculoskeletal disorders. To improve workplace safety, it is necessary to assess the musculoskeletal and biomechanical risk exposures associated with these tasks, which requires highly accurate 3D pose. Existing approaches mainly use marker-based sensors to collect 3D information. However, these methods are usually expensive to set up, time-consuming to process, and sensitive to the surrounding environment. In this study, we propose a multi-view deep perceptron approach to address the aforementioned limitations. Our approach consists of two modules: a "view-specific perceptron" network extracts rich information independently from the image of each view, including both 2D shape and hierarchical texture information, while a "multi-view integration" network synthesizes information from all available views to predict accurate 3D pose. To fully evaluate our approach, we carried out comprehensive experiments comparing different variants of our design. The results show that our approach achieves performance comparable to that of previous marker-based methods, i.e., an average error of 14.72 ± 2.96 mm on the lifting dataset. The results are also compared with state-of-the-art methods on the HumanEva-I dataset [1], demonstrating the superior performance of our approach.
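The two-module data flow described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the deep networks are replaced by single random linear maps, the images are short lists of numbers, and the names `view_specific_features`, `integrate_views`, `mean_joint_error`, `NUM_JOINTS`, and `FEATURE_DIM` are hypothetical. The error function assumes the reported error is a mean Euclidean distance per joint, which the abstract does not state explicitly.

```python
import math
import random

NUM_JOINTS = 3    # hypothetical; kept small for illustration
FEATURE_DIM = 4   # hypothetical size of each view's feature vector


def view_specific_features(image, weights):
    """Stand-in for the view-specific network: maps ONE view's image
    (here a flat list of pixel values) to a feature vector."""
    return [math.tanh(sum(w * p for w, p in zip(row, image))) for row in weights]


def integrate_views(per_view_features, weights):
    """Stand-in for the multi-view integration network: concatenates the
    features of all views and maps them to NUM_JOINTS (x, y, z) triples."""
    fused = [f for feats in per_view_features for f in feats]
    flat = [sum(w * f for w, f in zip(row, fused)) for row in weights]
    return [tuple(flat[3 * j:3 * j + 3]) for j in range(NUM_JOINTS)]


def mean_joint_error(pred, truth):
    """Mean Euclidean distance between predicted and reference joints
    (assumed metric; same units as the inputs, e.g. mm)."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)


def random_matrix(rows, cols, rng):
    """Random weights standing in for trained network parameters."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_views, pixels = 2, 8
images = [[rng.random() for _ in range(pixels)] for _ in range(num_views)]

# Each view gets its own extractor and is processed independently.
view_weights = [random_matrix(FEATURE_DIM, pixels, rng) for _ in range(num_views)]
features = [view_specific_features(img, w) for img, w in zip(images, view_weights)]

# The integration step sees the features from all views at once.
fusion_weights = random_matrix(3 * NUM_JOINTS, num_views * FEATURE_DIM, rng)
pose_3d = integrate_views(features, fusion_weights)
```

The point of the sketch is the structure only: per-view processing is independent, and the 3D pose is predicted only after the views are combined, so adding a camera adds one more feature vector to the fusion input.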

[1]  Thomas P Andriacchi,et al.  The evolution of methods for the capture of human movement leading to markerless motion capture for biomechanical applications , 2006, Journal of NeuroEngineering and Rehabilitation.

[2]  Bernt Schiele,et al.  2D Human Pose Estimation: New Benchmark and State of the Art Analysis , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Silvia Conforto,et al.  Markerless Human Motion Analysis in Gauss–Laguerre Transform Domain: An Application to Sit-To-Stand in Young and Elderly People , 2009, IEEE Transactions on Information Technology in Biomedicine.

[4]  Plamen Angelov,et al.  A Comprehensive Review on Handcrafted and Learning-Based Action Representation Approaches for Human Activity Recognition , 2017 .

[5]  Vishal M. Patel,et al.  Large Margin Multi-Modal Triplet Metric Learning , 2017, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017).

[6]  Vishal M. Patel,et al.  Generative adversarial network-based synthesis of visible faces from polarimetric thermal faces , 2017, 2017 IEEE International Joint Conference on Biometrics (IJCB).

[7]  Michael Arens,et al.  Human pose estimation with implicit shape models , 2010, ARTEMIS '10.

[8]  Lourdes Agapito,et al.  Lifting from the Deep: Convolutional 3D Pose Estimation from a Single Image , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Yu Yang,et al.  PIEFA: Personalized Incremental and Ensemble Face Alignment , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[10]  Jonathan Tompson,et al.  Efficient ConvNet-based marker-less motion capture in general scenes with a low number of cameras , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  G. David.  Ergonomic methods for assessing exposure to risk factors for work-related musculoskeletal disorders. , 2005, Occupational medicine.

[12]  Dimitris N. Metaxas,et al.  Reconstruction-Based Disentanglement for Pose-Invariant Face Recognition , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[13]  Pascal Fua,et al.  Learning to Fuse 2D and 3D Image Cues for Monocular Body Pose Estimation , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[14]  Idsart Kingma,et al.  Inter-rater reliability of a video-analysis method measuring low-back load in a field situation. , 2013, Applied ergonomics.

[15]  Emiliano Gambaretto,et al.  Markerless Motion Capture through Visual Hull, Articulated ICP and Subject Specific Model Generation , 2010, International Journal of Computer Vision.

[16]  Deva Ramanan,et al.  3D Human Pose Estimation = 2D Pose Estimation + Matching , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Ahmed M. Elgammal,et al.  From circle to 3-sphere: Head pose estimation by instance parameterization , 2015, Comput. Vis. Image Underst..

[18]  Antoni B. Chan,et al.  A Robust Likelihood Function for 3D Human Pose Tracking , 2014, IEEE Transactions on Image Processing.

[19]  Wen Gao,et al.  Robust Estimation of 3D Human Poses from a Single Image , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[20]  Nojun Kwak,et al.  3D Human Pose Estimation Using Convolutional Neural Networks with 2D Pose Information , 2016, ECCV Workshops.

[21]  Mohammed Bennamoun,et al.  A Gaussian Process Guided Particle Filter for Tracking 3D Human Pose in Video , 2013, IEEE Transactions on Image Processing.

[22]  Michael J. Black,et al.  HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion , 2010, International Journal of Computer Vision.

[23]  S M Hsiang,et al.  Video based lifting technique coding system. , 1998, Ergonomics.

[24]  Cordelia Schmid,et al.  MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild , 2016, NIPS.

[25]  Yichen Wei,et al.  Towards 3D Human Pose Estimation in the Wild: A Weakly-Supervised Approach , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[26]  Xiaowei Zhou,et al.  Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Vishal M. Patel,et al.  Image De-Raining Using a Conditional Generative Adversarial Network , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[28]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[29]  Varun Ramakrishna,et al.  Convolutional Pose Machines , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Jia Deng,et al.  Stacked Hourglass Networks for Human Pose Estimation , 2016, ECCV.

[31]  Dawei Li,et al.  DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices , 2017, AAAI.

[32]  Vladimir Pavlovic,et al.  Using a marker-less method for estimating L5/S1 moments during symmetrical lifting. , 2017, Applied ergonomics.

[33]  Rogério Schmidt Feris,et al.  A Recurrent Encoder-Decoder Network for Sequential Face Alignment , 2016, ECCV.

[34]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  J. Kaufman,et al.  Comparison of self-report, video observation and direct measurement methods for upper extremity musculoskeletal disorder physical risk factors , 2001, Ergonomics.

[36]  Xu Xu,et al.  A computer vision based method for 3D posture estimation of symmetrical lifting. , 2018, Journal of biomechanics.

[37]  Antoni B. Chan,et al.  3D Human Pose Estimation from Monocular Images with Deep Convolutional Neural Network , 2014, ACCV.

[38]  Vincent Lepetit,et al.  Direct Prediction of 3D Body Poses from Motion Compensated Sequences , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Bernt Schiele,et al.  Multi-view Pictorial Structures for 3D Human Pose Estimation , 2013, BMVC.

[40]  Jean-Luc Dugelay,et al.  Learned vs. Hand-Crafted Features for Pedestrian Gender Recognition , 2015, ACM Multimedia.

[41]  Junzhou Huang,et al.  Track Facial Points in Unconstrained Videos , 2016, BMVC.