Regression-based convolutional 3D pose estimation from single image

Estimating 3D human pose from a single image is challenging because projection from 3D space onto the 2D image plane is ambiguous. A new two-stage method based on deep convolutional neural networks is proposed for regressing the distance and angular difference matrices among body joints. For articulated objects such as the human body, using the angular differences between body joints in addition to the distances between them better models the structure of the shape and increases the modelling capability of the learning method. Experimental results on the HumanEva-I and Human3.6M datasets show that the proposed method substantially improves on state-of-the-art methods in the mean per-joint position error (MPJPE) measure.
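As an illustration of the quantities named above, the following minimal sketch computes a pairwise distance matrix, a pairwise angular difference matrix, and the mean per-joint position error for a set of 3D joints. It is not the proposed network. The abstract does not define the angular difference, so the sketch assumes it is the angle between two joints' position vectors measured from a root joint. The function names, the `root` parameter, and the convention of setting the root's own entries to zero are hypothetical choices made for this example.

```python
import numpy as np


def distance_matrix(joints):
    """Pairwise Euclidean distances between joints; `joints` has shape (J, 3)."""
    diff = joints[:, None, :] - joints[None, :, :]  # shape (J, J, 3)
    return np.linalg.norm(diff, axis=-1)            # shape (J, J)


def angular_difference_matrix(joints, root=0, eps=1e-8):
    """Pairwise angles in radians between joint vectors measured from a root joint.

    Assumed definition, not taken from the source: entry (i, j) is the angle
    between (joint_i - root) and (joint_j - root).
    """
    vecs = joints - joints[root]
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    unit = vecs / np.maximum(norms, eps)     # avoid division by zero at the root
    cos = np.clip(unit @ unit.T, -1.0, 1.0)  # guard against rounding outside [-1, 1]
    angles = np.arccos(cos)
    # The angle is undefined for the root itself; set those entries to 0 by convention.
    angles[root, :] = 0.0
    angles[:, root] = 0.0
    return angles


def mpjpe(pred, target):
    """Mean per-joint position error: mean Euclidean distance over joints."""
    return float(np.mean(np.linalg.norm(pred - target, axis=-1)))


# Toy skeleton with three joints; the first joint acts as the root.
joints = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0]])
D = distance_matrix(joints)
A = angular_difference_matrix(joints)
error = mpjpe(joints + np.array([0.0, 0.0, 1.0]), joints)
```

Both matrices are symmetric, so a regressor would only need to predict the upper triangle of each. Recovering 3D joint positions from predicted matrices is a separate step that this sketch does not cover.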