Regression Based Pose Estimation with Automatic Occlusion Detection and Rectification

Human pose estimation is a classic problem in computer vision. Statistical models based on part-based modelling and the pictorial structure framework have been widely used in recent work on articulated human pose estimation. However, the performance of these models is limited by self-occlusion. This paper presents a learning-based framework that automatically detects and recovers self-occluded body parts. We learn two models: one for detecting occluded parts in the upper body and another for the lower body. To address the key problem of knowing which parts are occluded, we construct Gaussian Process Regression (GPR) models that relate the parameters of the occluded body parts to their corresponding ground-truth parameters. Using these models, the pictorial structure of the occluded parts in unseen images is automatically rectified. The proposed framework outperforms a state-of-the-art pictorial structure approach for human pose estimation on three different datasets.
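As a rough illustration of the regression step described above, the following minimal sketch fits a zero-mean Gaussian process with a squared-exponential kernel that maps estimated part parameters to ground-truth parameters. Everything specific here is an assumption made for illustration and is not taken from the paper: the three-value part parameterisation (x, y, angle), the kernel choice, the noise level, the toy data, and the helper names `rbf`, `fit_gpr` and `predict_gpr`.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two parameter vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(L)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gpr(X, Y, noise=1e-3, length_scale=1.0):
    """Precompute alpha = (K + noise * I)^-1 y for each output dimension."""
    n = len(X)
    K = [[rbf(X[i], X[j], length_scale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alphas = [solve_cholesky(L, [y[d] for y in Y]) for d in range(len(Y[0]))]
    return {"X": X, "alphas": alphas, "length_scale": length_scale}

def predict_gpr(model, x):
    """Posterior mean at x: k(x, X) @ alpha, one value per output dimension."""
    k_star = [rbf(x, xi, model["length_scale"]) for xi in model["X"]]
    return [sum(k * a for k, a in zip(k_star, alpha)) for alpha in model["alphas"]]

# Hypothetical toy data: each row is (x, y, angle) for one body part.
# X_train holds parameters estimated under occlusion, Y_train the ground truth.
X_train = [[0.0, 0.0, 0.0], [1.0, 0.5, 0.2], [2.0, 1.0, -0.3], [3.0, 0.0, 0.5]]
Y_train = [[0.1, 0.0, 0.05], [1.2, 0.4, 0.25], [1.9, 1.1, -0.2], [3.1, 0.1, 0.45]]

model = fit_gpr(X_train, Y_train, noise=1e-6)
rectified = predict_gpr(model, [1.5, 0.7, 0.0])  # rectified parameters for an unseen estimate
```

Because the sketch uses a zero-mean prior, predictions for inputs far from the training set shrink towards zero; a practical system would more plausibly normalise the parameters or regress a correction relative to the input estimate.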
