Optical Unmarked Motion Capture Technology Based on Depth Network and Binocular Vision

This paper presents an optical unmarked (markerless) motion capture method based on a depth network and binocular vision. The method improves on marker-based motion capture by eliminating the need for additional markers, which reduces the complexity of the motion capture system. The paper also optimizes the human joint point coding method so that it yields the sequence numbers and interdependence of 18 human joint points, including the toes. A deep convolutional neural network then extracts the 2D coordinates of the human joint points in each of the two views. From these, the 3D coordinates of the joint points are obtained using the binocular vision principle and the least squares method. The human skeleton model is then drawn from the 3D coordinates to reflect the motion state of the human body.
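The step from two-view 2D joint coordinates to a 3D joint position can be illustrated with a minimal sketch of linear least-squares triangulation. This is not taken from the paper: it assumes two calibrated cameras described by known 3x4 projection matrices, and the names `project`, `solve3`, and `triangulate`, as well as the camera values in the usage note, are hypothetical. Each view contributes two linear equations in the unknown point (X, Y, Z), and the resulting overdetermined system is solved through its normal equations.

```python
def project(P, X):
    """Project 3D point X with a 3x4 camera matrix P to pixel coordinates (u, v)."""
    Xh = list(X) + [1.0]
    x, y, w = (sum(p * c for p, c in zip(row, Xh)) for row in P)
    return x / w, y / w


def solve3(M, b):
    """Solve the 3x3 system M x = b by Gaussian elimination with partial pivoting."""
    A = [list(row) + [rhs] for row, rhs in zip(M, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - s) / A[i][i]
    return x


def triangulate(P_left, P_right, uv_left, uv_right):
    """Least-squares 3D point from one 2D joint observed in two calibrated views."""
    rows, rhs = [], []
    for P, (u, v) in ((P_left, uv_left), (P_right, uv_right)):
        for coord, idx in ((u, 0), (v, 1)):
            # From coord = (P[idx] . Xh) / (P[2] . Xh): move the terms in
            # X, Y, Z to the left and the constant term to the right.
            rows.append([coord * P[2][k] - P[idx][k] for k in range(3)])
            rhs.append(P[idx][3] - coord * P[2][3])
    # Normal equations (A^T A) x = A^T b of the 4x3 system.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(3)]
    return solve3(AtA, Atb)
```

As a usage example under assumed values, take a focal length of 800 pixels, a principal point at (320, 240), and a 0.1 m horizontal baseline. The left and right matrices are then `[[800, 0, 320, 0], [0, 800, 240, 0], [0, 0, 1, 0]]` and `[[800, 0, 320, -80], [0, 800, 240, 0], [0, 0, 1, 0]]`. Calling `triangulate` on the two projections of a joint recovers its 3D position up to numerical error. With noisy 2D detections from a network, the same system gives the least-squares estimate rather than an exact intersection.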
