Robust Human Pose Estimation for Rotation via Self-Supervised Learning

Detecting abnormal postures, such as that of a reclining person, is a crucial part of visual surveillance. Moreover, even ordinary poses can appear rotated because of a mismatch between the image orientation and the angle of a pre-installed camera. However, most existing human pose estimation methods handle only small rotational changes, i.e., those of less than 50 degrees, and seldom address more drastic rotations. To the best of our knowledge, the robustness of human pose estimation to large-angle rotational changes has not previously been reported. In this study, we propose a robust human pose estimation method that adds a path for learning new rotational changes in a self-supervised manner and combines its results with those of a path trained in a supervised manner. Furthermore, a combination module composed of a convolutional layer is trained complementarily by both paths of the network to produce robust results across various rotational changes. We demonstrate the robustness of the proposed method through extensive experiments on images generated by rotating images from standard benchmark datasets, and we analyze in detail the rotational characteristics of state-of-the-art human pose estimators and of the proposed method. On the COCO Keypoint Detection dataset, the proposed method improves the mean of the average precision by more than 15% over the state-of-the-art method and improves the standard deviation of the performance by a factor of more than 4.7.
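
Evaluating on rotated benchmark images requires the keypoint annotations to be rotated together with the pixels. The following is a minimal sketch of that step, not the authors' code: the function name `rotate_keypoints`, the choice of the image center as the pivot, and the (x, y) pixel-coordinate layout are all illustrative assumptions.

```python
import numpy as np


def rotate_keypoints(keypoints, angle_deg, width, height):
    """Rotate (x, y) keypoints about the image center (hypothetical helper).

    A positive angle turns the +x axis toward the +y axis. Because image
    y-coordinates grow downward, this looks clockwise on screen.
    """
    theta = np.deg2rad(angle_deg)
    # Assumed pivot: the geometric center of a width x height pixel grid.
    center = np.array([(width - 1) / 2.0, (height - 1) / 2.0])
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    pts = np.asarray(keypoints, dtype=float)
    # Shift to the pivot, rotate every row vector, then shift back.
    return (pts - center) @ rot.T + center


# Usage: rotate two annotated joints of a 101 x 101 image by 90 degrees.
joints = [[60.0, 50.0], [50.0, 30.0]]
rotated = rotate_keypoints(joints, 90, width=101, height=101)
```

Sweeping `angle_deg` over a full turn and recording a pose estimator's average precision at each angle would give the per-angle scores from which a mean and a standard deviation, like those reported above, can be computed.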
