Paper information - Occlusion-Aware Human Pose Estimation with Mixtures of Sub-Trees


In this paper, we study the problem of learning a model for human pose estimation as mixtures of compositional sub-trees in two layers of prediction. This involves estimating the pose of each sub-tree and then identifying the relationships between sub-trees, which are used to handle occlusions between different parts. The mixtures of the sub-trees are learnt using both geometric and appearance distances. The Chow-Liu (CL) algorithm is applied recursively to determine the inter-relations between the nodes and to build the structure of the sub-trees. These structures are used to learn the latent parameters of the sub-trees, and inference is performed using standard belief propagation. The proposed method handles occlusions during inference by identifying overlapping regions between different sub-trees and introducing a penalty term for overlapping parts. Experiments are performed on three datasets: Leeds Sports, Image Parse and UIUC People. The results show that the proposed method is more robust to occlusions than state-of-the-art approaches.
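As a rough illustration of the structure-learning step mentioned in the abstract, the basic Chow-Liu algorithm builds a maximum-weight spanning tree whose edge weights are pairwise mutual information. The sketch below is a minimal, generic version on toy discrete data. It is not the authors' implementation: the paper applies the algorithm recursively to body parts with geometric and appearance distances, whereas the function names, the toy variables and the use of Kruskal's algorithm here are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def chow_liu_tree(columns):
    """Return the edges of a maximum-weight spanning tree over the variables.

    `columns` is a list of equal-length discrete sequences, one per variable.
    Edge weights are pairwise empirical mutual information; the tree is
    built with Kruskal's algorithm (hypothetical helper, not from the paper).
    """
    k = len(columns)
    weighted = sorted(
        ((mutual_information(columns[i], columns[j]), i, j)
         for i, j in combinations(range(k), 2)),
        reverse=True,
    )
    parent = list(range(k))  # union-find forest used for cycle detection

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges


# Toy data (assumed): variable 1 copies variable 0, variable 2 is independent.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = list(a)
c = [0, 1, 0, 1, 0, 1, 0, 1]
tree = chow_liu_tree([a, b, c])
```

In this toy case the strongly dependent pair (variables 0 and 1) is linked first, and the independent variable is attached by a zero-weight edge only to keep the tree connected. A recursive variant in the spirit of the abstract would rerun the same procedure within each group of nodes to form sub-trees.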

Roland Göcke | Abhinav Dhall | Ibrahim Radwan
