Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective

Estimation of the human pose from a monocular camera has been an emerging research topic in the computer vision community with many applications. Recently, benefited from the deep learning technologies, a significant amount of research efforts have greatly advanced the monocular human pose estimation both in 2D and 3D areas. Although there have been some works to summarize the different approaches, it still remains challenging for researchers to have an in-depth view of how these approaches work. In this paper, we provide a comprehensive and holistic 2D-to-3D perspective to tackle this problem. We categorize the mainstream and milestone approaches since the year 2014 under unified frameworks. By systematically summarizing the differences and connections between these approaches, we further analyze the solutions for challenging cases, such as the lack of data, the inherent ambiguity between 2D and 3D, and the complex multi-person scenarios. We also summarize the pose representation styles, benchmarks, evaluation metrics, and the quantitative performance of popular approaches. Finally, we discuss the challenges and give deep thinking of promising directions for future research. We believe this survey will provide the readers with a deep and insightful understanding of monocular human pose estimation.

[1]  Yong Du,et al.  Hierarchical recurrent neural network for skeleton based action recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Michael J. Black,et al.  SMPL: A Skinned Multi-Person Linear Model , 2023 .

[3]  Ilya Kostrikov,et al.  An Efficient Convolutional Network for Human Pose Estimation , 2016, BMVC.

[4]  Alexandre Alahi,et al.  PifPaf: Composite Fields for Human Pose Estimation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Gerard Pons-Moll,et al.  Human pose estimation from video and inertial sensors , 2015 .

[6]  Cordelia Schmid,et al.  Learning from Synthetic Humans , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Chen Qian,et al.  Multi-Person Articulated Tracking With Spatial and Temporal Embeddings , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Jitendra Malik,et al.  Learning 3D Human Dynamics From Video , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Cordelia Schmid,et al.  LCR-Net++: Multi-Person 2D and 3D Pose Detection in Natural Images , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Leland McInnes,et al.  UMAP: Uniform Manifold Approximation and Projection , 2018, J. Open Source Softw..

[11]  Hanqing Lu,et al.  Occlusion-Aware Siamese Network for Human Pose Estimation , 2020, ECCV.

[12]  Jordi Gonzàlez,et al.  Human Pose Estimation from Monocular Images: A Comprehensive Survey , 2016, Sensors.

[13]  Liang Lin,et al.  Look into Person: Joint Body Parsing & Pose Estimation Network and a New Benchmark , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Ben Taskar,et al.  MODEC: Multimodal Decomposable Models for Human Pose Estimation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  Kui Jia,et al.  HEMlets Pose: Learning Part-Centric Heatmap Triplets for Accurate 3D Human Pose Estimation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[16]  Shuicheng Yan,et al.  Mutual Learning to Adapt for Joint Human Parsing and Pose Estimation , 2018, ECCV.

[17]  Song-Chun Zhu,et al.  DenseRaC: Joint 3D Pose and Shape Estimation by Dense Render-and-Compare , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[18]  Ruigang Yang,et al.  Human Pose Estimation with Spatial Contextual Information , 2019, ArXiv.

[19]  Hujun Bao,et al.  SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation , 2020, ECCV.

[20]  Yu Wu,et al.  Pose-Guided Feature Alignment for Occluded Person Re-Identification , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[21]  Fei Yang,et al.  Jointly Optimize Data Augmentation and Network Training: Adversarial Data Augmentation in Human Pose Estimation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Jitendra Malik,et al.  Human Pose Estimation with Iterative Error Feedback , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Shuicheng Yan,et al.  Pose Partition Networks for Multi-person Pose Estimation , 2018, ECCV.

[24]  Yichen Wei,et al.  Compositional Human Pose Regression , 2018, Comput. Vis. Image Underst..

[25]  Lourdes Agapito,et al.  Lifting from the Deep: Convolutional 3D Pose Estimation from a Single Image , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Zhenhua Wang,et al.  Synthesizing Training Images for Boosting Human 3D Pose Estimation , 2016, 2016 Fourth International Conference on 3D Vision (3DV).

[27]  Andrew Zisserman,et al.  Human Pose Estimation Using a Joint Pixel-wise and Part-wise Formulation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Emre Akbas,et al.  Self-Supervised Learning of 3D Human Pose Using Multi-View Geometry , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Yu Sun,et al.  CenterHMR: a Bottom-up Single-shot Method for Multi-person 3D Mesh Recovery from a Single Image , 2020, ArXiv.

[30]  James J. Little,et al.  Exploiting Temporal Information for 3D Human Pose Estimation , 2017, ECCV.

[31]  Xiaogang Wang,et al.  Learning Feature Pyramids for Human Pose Estimation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[32]  Cewu Lu,et al.  Pairwise Body-Part Attention for Recognizing Human-Object Interactions , 2018, ECCV.

[33]  Sergio Escalera,et al.  Multi-task human analysis in still images: 2D/3D pose, depth map, and multi-part segmentation , 2019, 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019).

[34]  Chunluan Zhou,et al.  Temporal Keypoint Matching and Refinement Network for Pose Estimation and Tracking , 2020, ECCV.

[35]  Shuicheng Yan,et al.  Human Pose Estimation with Parsing Induced Learner , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[36]  B. Schiele,et al.  Pictorial structures revisited: People detection and articulated pose estimation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[37]  Jonathan Tompson,et al.  Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation , 2014, NIPS.

[38]  Wanggen Wan,et al.  A Review of Human Pose Estimation from Single Image , 2018, 2018 International Conference on Audio, Language and Image Processing (ICALIP).

[39]  Peter V. Gehler,et al.  DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Juergen Gall,et al.  PoseTrack: Joint Multi-person Pose Estimation and Tracking , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Ruigang Yang,et al.  Detailed Human Shape Estimation From a Single Image by Hierarchical Mesh Deformation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Feiyue Huang,et al.  Adversarial Semantic Data Augmentation for Human Pose Estimation , 2020, ECCV.

[43]  Kwang-Ting Cheng,et al.  Cascaded Deep Monocular 3D Human Pose Estimation With Evolutionary Training Data , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Siddharth Seth,et al.  Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Deva Ramanan,et al.  Learning to parse images of articulated bodies , 2006, NIPS.

[46]  Ardhendu Behera,et al.  Adapting MobileNets for mobile based upper body pose estimation , 2018, 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[47]  Cristian Sminchisescu,et al.  Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes: The Importance of Multiple Scene Constraints , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[48]  Bingbing Ni,et al.  Deep Kinematics Analysis for Monocular 3D Human Pose Estimation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[49]  Weiyu Zhang,et al.  From Actemes to Action: A Strongly-Supervised Representation for Detailed Action Understanding , 2013, 2013 IEEE International Conference on Computer Vision.

[50]  Iasonas Kokkinos,et al.  DensePose: Dense Human Pose Estimation in the Wild , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[51]  Xiangyu Zhang,et al.  Learning Delicate Local Representations for Multi-Person Pose Estimation , 2020, ECCV.

[52]  Kyoung Mu Lee,et al.  Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose , 2020, ECCV.

[53]  James M. Rehg,et al.  Unsupervised 3D Pose Estimation With Geometric Self-Supervision , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[54]  Dahua Lin,et al.  Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition , 2018, AAAI.

[55]  Hans-Peter Seidel,et al.  VNect , 2017, ACM Trans. Graph..

[56]  Chun Chen,et al.  A survey of human pose estimation: The body parts parsing based methods , 2015, J. Vis. Commun. Image Represent..

[57]  Cewu Lu,et al.  Transferable Interactiveness Knowledge for Human-Object Interaction Detection , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[58]  Ying Wu,et al.  Deeply Learned Compositional Models for Human Pose Estimation , 2018, ECCV.

[59]  Mohan M. Trivedi,et al.  Human Pose Estimation and Activity Recognition From Multi-View Videos: Comparative Explorations of Recent Developments , 2012, IEEE Journal of Selected Topics in Signal Processing.

[60]  Kyoung Mu Lee,et al.  Camera Distance-Aware Top-Down Approach for 3D Multi-Person Pose Estimation From a Single RGB Image , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[61]  Peiyun Hu,et al.  Bottom-Up and Top-Down Reasoning with Hierarchical Rectified Gaussians , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[62]  Christian Szegedy,et al.  DeepPose: Human Pose Estimation via Deep Neural Networks , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[63]  Junqi Liu,et al.  Detailed 2D-3D Joint Representation for Human-Object Interaction , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[64]  Dimitrios Tzionas,et al.  Embodied Hands: Modeling and Capturing Hands and Bodies Together , 2022, ArXiv.

[65]  Kyoung Mu Lee,et al.  PoseFix: Model-Agnostic General Human Pose Refinement Network , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[66]  Bodo Rosenhahn,et al.  Supplementary Material to: Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera , 2018 .

[67]  Dimitris N. Metaxas,et al.  Quantized Densely Connected U-Nets for Efficient Landmark Localization , 2018, ECCV.

[68]  Shiliang Zhang,et al.  Pose-Driven Deep Convolutional Model for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[69]  Jonathan Tompson,et al.  Efficient object localization using Convolutional Networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[70]  Davide Modolo,et al.  Combining Detection and Tracking for Human Pose Estimation in Videos , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[71]  Jian Dong,et al.  Towards Unified Human Parsing and Pose Estimation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[72]  Deva Ramanan,et al.  3D Human Pose Estimation = 2D Pose Estimation + Matching , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[73]  Kyoung Mu Lee,et al.  I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image , 2020, ECCV.

[74]  Peter V. Gehler,et al.  Keep It SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image , 2016, ECCV.

[75]  Antoni B. Chan,et al.  3D Human Pose Estimation from Monocular Images with Deep Convolutional Neural Network , 2014, ACCV.

[76]  Michael J. Black,et al.  Puppet Flow , 2013 .

[77]  Andrew Zisserman,et al.  Personalizing Human Video Pose Estimation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[78]  Nikolaus F. Troje,et al.  MoVi: A large multi-purpose human motion and video dataset , 2020, PloS one.

[79]  Gang Yu,et al.  Rethinking on Multi-Stage Networks for Human Pose Estimation , 2019, ArXiv.

[80]  David Picard,et al.  Human Pose Regression by Combining Indirect Part Detection and Contextual Information , 2017, Comput. Graph..

[81]  Marc Pollefeys,et al.  H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[82]  Pascal Fua,et al.  Learning Monocular 3D Human Pose Estimation from Multi-view Images , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[83]  Dong Liu,et al.  Deep High-Resolution Representation Learning for Human Pose Estimation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[84]  Yalew Zelalem Jembre,et al.  The Progress of Human Pose Estimation: A Survey and Taxonomy of Models Applied in 2D Human Pose Estimation , 2020, IEEE Access.

[85]  Bernt Schiele,et al.  ArtTrack: Articulated Multi-Person Tracking in the Wild , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[86]  Nojun Kwak,et al.  Pose estimator and tracker using temporal flow maps for limbs , 2019, 2019 International Joint Conference on Neural Networks (IJCNN).

[87]  Lorenzo Torresani,et al.  Detect-and-Track: Efficient Pose Estimation in Videos , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[88]  Asim Kadav,et al.  15 Keypoints Is All You Need , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[89]  Kostas Daniilidis,et al.  TexturePose: Supervising Human Mesh Estimation With Texture Consistency , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[90]  Hwann-Tzong Chen,et al.  Self Adversarial Training for Human Pose Estimation , 2017, 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).

[91]  Wu Liu,et al.  Pose-Guided Tracking-by-Detection: Robust Multi-Person Pose Tracking , 2021, IEEE Transactions on Multimedia.

[92]  Sanghoon Lee,et al.  Propagating LSTM: 3D Pose Estimation Based on Joint Interdependency , 2018, ECCV.

[93]  Yu Tian,et al.  Semantic Graph Convolutional Networks for 3D Human Pose Regression , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[94]  Zhiao Huang,et al.  Associative Embedding: End-to-End Learning for Joint Detection and Grouping , 2016, NIPS.

[95]  Michael J. Black,et al.  The stitched puppet: A graphical model of 3D human shape and pose , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[96]  Michael J. Black,et al.  Learning to Reconstruct 3D Human Pose and Shape via Model-Fitting in the Loop , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[97]  Arne Schumann,et al.  Human Pose Estimation for Real-World Crowded Scenarios , 2019, 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[98]  Juergen Gall,et al.  Pose for Action - Action for Pose , 2016, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017).

[99]  Catherine Achard,et al.  PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[100]  Jing Xu,et al.  Attention-Aware Compositional Network for Person Re-identification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[101]  Luc Van Gool,et al.  Thin-Slicing Network: A Deep Structured Model for Pose Estimation in Videos , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[102]  Nikolaus F. Troje,et al.  AMASS: Archive of Motion Capture As Surface Shapes , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[103]  Xiaogang Wang,et al.  Multi-context Attention for Human Pose Estimation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[104]  Andrew Zisserman,et al.  Flowing ConvNets for Human Pose Estimation in Videos , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[105]  Abhishek Sharma,et al.  Learning 3D Human Pose from Structure and Motion , 2017, ECCV.

[106]  Andreas Doering JointFlow: Temporal Flow Fields for Multi Person Pose Tracking , 2021 .

[107]  Guan Huang,et al.  The Devil Is in the Details: Delving Into Unbiased Data Processing for Human Pose Estimation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[108]  Ruixu Liu,et al.  Attention Mechanism Exploits Temporal Contexts: Real-Time 3D Human Pose Reconstruction , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[109]  Bernt Schiele,et al.  2D Human Pose Estimation: New Benchmark and State of the Art Analysis , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[110]  Takeo Kanade,et al.  Panoptic Studio: A Massively Multiview System for Social Motion Capture , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[111]  Nicu Sebe,et al.  Human in Events: A Large-Scale Benchmark for Human-centric Video Analysis in Complex Events , 2020, ArXiv.

[112]  Guanghan Ning,et al.  LightTrack: A Generic Framework for Online Top-Down Human Pose Tracking , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[113]  Dong Liu,et al.  Human Pose Estimation Using Global and Local Normalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[114]  Pascal Fua,et al.  Monocular 3D Human Pose Estimation in the Wild Using Improved CNN Supervision , 2016, 2017 International Conference on 3D Vision (3DV).

[115]  Emre Akbas,et al.  MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network , 2018, ECCV.

[116]  Kaiqi Huang,et al.  ORGM: Occlusion Relational Graphical Model for Human Pose Estimation , 2017, IEEE Transactions on Image Processing.

[117]  Yu Qiao,et al.  RPAN: An End-to-End Recurrent Pose-Attention Network for Action Recognition in Videos , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[118]  Michael J. Black,et al.  Learning a model of facial shape and expression from 4D scans , 2017, ACM Trans. Graph..

[119]  Mao Ye,et al.  Fast Human Pose Estimation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[120]  Xiaowei Zhou,et al.  Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[121]  Jia Li,et al.  Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation , 2019, AAAI.

[122]  Yaser Sheikh,et al.  Single-Network Whole-Body Pose Estimation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[123]  Xiaogang Wang,et al.  Spindle Net: Person Re-identification with Human Body Region Guided Feature Decomposition and Fusion , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[124]  Yaser Sheikh,et al.  Efficient Online Multi-Person 2D Pose Tracking With Recurrent Spatio-Temporal Affinity Fields , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[125]  Peter V. Gehler,et al.  Unite the People: Closing the Loop Between 3D and 2D Human Representations , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[126]  C. Qian,et al.  HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation , 2020, ECCV.

[127]  Dario Pavllo,et al.  3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[128]  Iasonas Kokkinos,et al.  HoloPose: Holistic 3D Human Reconstruction In-The-Wild , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[129]  Xu Chen,et al.  Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[130]  Jin Xu,et al.  Whole-Body Human Pose Estimation in the Wild , 2020, ECCV.

[131]  Jonathan Tompson,et al.  Towards Accurate Multi-person Pose Estimation in the Wild , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[132]  Xuming He,et al.  Pose-Aware Multi-Level Feature Network for Human Object Interaction Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[133]  Hui Cheng,et al.  Recurrent 3D Pose Sequence Machines , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[134]  Simone Calderara,et al.  Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[135]  Andreas Savakis,et al.  UniPose: Unified Human Pose Estimation in Single Images and Videos , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[136]  Liang Zheng,et al.  Correlating Edge, Pose With Parsing , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[137]  Mark Everingham,et al.  Clustered Pose and Nonlinear Appearance Models for Human Pose Estimation , 2010, BMVC.

[138]  Dacheng Tao,et al.  A Coarse-Fine Network for Keypoint Localization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[139]  Andrew Zisserman,et al.  Recurrent Human Pose Estimation , 2016, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017).

[140]  Xiu-Shen Wei,et al.  Adversarial PoseNet: A Structure-Aware Convolutional Network for Human Pose Estimation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[141]  Yi Yang,et al.  Articulated Human Detection with Flexible Mixtures of Parts , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[142]  Song-Chun Zhu,et al.  Joint action recognition and pose estimation from video , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[143]  Michael J. Black,et al.  HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion , 2010, International Journal of Computer Vision.

[144]  Saurabh Sharma,et al.  Monocular 3D Human Pose Estimation by Generation and Ordinal Ranking , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[145]  Jun Hong,et al.  Pose-native Network Architecture Search for Multi-person Human Pose Estimation , 2020, ACM Multimedia.

[146]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[147]  Xiaowei Zhou,et al.  Coherent Reconstruction of Multiple Humans From a Single Image , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[148]  Cristian Sminchisescu,et al.  Deep Network for the Integrated 3D Sensing of Multiple People in Natural Images , 2018, NeurIPS.

[149]  Cordelia Schmid,et al.  Towards Understanding Action Recognition , 2013, 2013 IEEE International Conference on Computer Vision.

[150]  Song-Chun Zhu,et al.  Monocular 3D Human Pose Estimation by Predicting Depth on Joints , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[151]  Dongdong Yu,et al.  Multi-Person Pose Estimation With Enhanced Channel-Wise and Spatial Information , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[152]  Kostas Daniilidis,et al.  Convolutional Mesh Regression for Single-Image Human Shape Reconstruction , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[153]  Christian Theobalt,et al.  Single-Shot Multi-person 3D Pose Estimation from Monocular RGB , 2017, 2018 International Conference on 3D Vision (3DV).

[154]  Takeo Kanade,et al.  Panoptic Studio: A Massively Multiview System for Social Interaction Capture , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[155]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[156]  Ying Wu,et al.  Does Learning Specific Features for Related Parts Help Human Pose Estimation? , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[157]  Xiaowei Zhou,et al.  Ordinal Depth Supervision for 3D Human Pose Estimation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[158]  Yaser Sheikh,et al.  OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[159]  Michael J. Black,et al.  VIBE: Video Inference for Human Body Pose and Shape Estimation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[160]  Alan L. Yuille,et al.  OriNet: A Fully Convolutional Network for 3D Human Pose Estimation , 2018, BMVC.

[161]  Yaser Sheikh,et al.  Monocular Total Capture: Posing Face, Body, and Hands in the Wild , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[162]  Thomas Serre,et al.  HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.

[163]  Georgios Tzimiropoulos,et al.  Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[164]  Andrea Palazzi,et al.  Learning to Detect and Track Visible and Occluded Body Joints in a Virtual World , 2018, ECCV.

[165]  Jia Deng,et al.  Stacked Hourglass Networks for Human Pose Estimation , 2016, ECCV.

[166]  Yichen Wei,et al.  Integral Human Pose Regression , 2017, ECCV.

[167]  Haoyu Wang,et al.  Pose Flow: Efficient Online Pose Tracking , 2018, BMVC.

[168]  Varun Ramakrishna,et al.  Convolutional Pose Machines , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[169]  Xiaowei Zhou,et al.  Sparseness Meets Deepness: 3D Human Pose Estimation from Monocular Video , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[170]  Dragomir Anguelov,et al.  SCAPE: shape completion and animation of people , 2005, ACM Trans. Graph..

[171]  Jonathan Tompson,et al.  PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model , 2018, ECCV.

[172]  Dimitrios Tzionas,et al.  Resolving 3D Human Pose Ambiguities With 3D Scene Constraints , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[173]  Jitendra Malik,et al.  Poselets: Body part detectors trained using 3D human pose annotations , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[174]  Tao Mei,et al.  POINet: Pose-Guided Ovonic Insight Network for Multi-Person Pose Tracking , 2019, ACM Multimedia.

[175]  David Picard,et al.  2D/3D Pose Estimation and Action Recognition Using Multitask Deep Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[176]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[177]  Huchuan Lu,et al.  Pose-Invariant Embedding for Deep Person Re-Identification , 2017, IEEE Transactions on Image Processing.

[178]  Shimon Ullman,et al.  Human Pose Estimation Using Deep Consensus Voting , 2016, ECCV.

[179]  HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[180]  Bo Wang,et al.  Occlusion-Aware Networks for 3D Human Pose Estimation in Video , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[181]  Jiashi Feng,et al.  Dynamic Kernel Distillation for Efficient Pose Estimation in Videos , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[182]  Xiaogang Wang,et al.  Structured Feature Learning for Pose Estimation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[183]  Dimitrios Tzionas,et al.  Expressive Body Capture: 3D Hands, Face, and Body From a Single Image , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[184]  Yichen Wei,et al.  Simple Baselines for Human Pose Estimation and Tracking , 2018, ECCV.

[185]  Cordelia Schmid,et al.  MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild , 2016, NIPS.

[186]  Kang Zheng,et al.  Combining local appearance and holistic view: Dual-Source Deep Neural Networks for human pose estimation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[187]  Navdeep Jaitly,et al.  Chained Predictions Using Convolutional Neural Networks , 2016, ECCV.

[188]  Yifan Zhang,et al.  Skeleton-Based Action Recognition With Multi-Stream Adaptive Graph Convolutional Networks , 2019, IEEE Transactions on Image Processing.

[189]  Juergen Gall,et al.  Self-supervised Keypoint Correspondences for Multi-Person Pose Estimation and Tracking in Videos , 2020, ECCV.

[190]  Yiming Yang,et al.  DARTS: Differentiable Architecture Search , 2018, ICLR.

[191]  Bernt Schiele,et al.  Learning to Refine Human Pose Estimation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[192]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[193]  Jitendra Malik,et al.  End-to-End Recovery of Human Shape and Pose , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[194]  Vincent Lepetit,et al.  Direct Prediction of 3D Body Poses from Motion Compensated Sequences , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[195]  Shuicheng Yan,et al.  Hierarchical Contextual Refinement Networks for Human Pose Estimation , 2019, IEEE Transactions on Image Processing.

[196]  Bernt Schiele,et al.  PoseTrack: A Benchmark for Human Pose Estimation and Tracking , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[197]  Deva Ramanan,et al.  Perceiving 3D Human-Object Spatial Arrangements from a Single Image in the Wild , 2020, ECCV.

[198]  Ioannis A. Kakadiaris,et al.  3D Human pose estimation: A review of the literature and analysis of covariates , 2016, Comput. Vis. Image Underst..

[199]  Jianbo Liu,et al.  LSTM Pose Machines , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[200]  Shang Gao,et al.  Pose-Guided Visible Part Matching for Occluded Person ReID , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[201]  Yingli Tian,et al.  Monocular human pose estimation: A survey of deep learning-based methods , 2020, Comput. Vis. Image Underst..

[202]  Pascal Fua,et al.  ActiveMoCap: Optimized Viewpoint Selection for Active Human Motion Capture , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[203]  Bo Zhao,et al.  AI Challenger : A Large-scale Dataset for Going Deeper in Image Understanding , 2017, ArXiv.

[204]  M. Pollefeys,et al.  Unified Egocentric Recognition of 3 D Hand-Object Poses and Interactions , 2019 .

[205]  James J. Little,et al.  A Simple Yet Effective Baseline for 3d Human Pose Estimation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[206]  Cordelia Schmid,et al.  P-CNN: Pose-Based CNN Features for Action Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[207]  Peng Wang,et al.  Joint Multi-person Pose Estimation and Semantic Part Segmentation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[208]  Andrew Zisserman,et al.  Progressive search space reduction for human pose estimation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[209]  Gang Yu,et al.  Cascaded Pyramid Network for Multi-person Pose Estimation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[210]  Xiaogang Wang,et al.  Multi-source Deep Learning for Human Pose Estimation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[211]  Pavlo Molchanov,et al.  Weakly-Supervised 3D Human Pose Learning via Multi-View Images in the Wild , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[212]  Alan L. Yuille,et al.  Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations , 2014, NIPS.

[213]  Lizhuang Ma,et al.  EllipBody: A Light-weight and Part-based Representation for Human Pose and Shape Recovery , 2020, ArXiv.

[214]  Jian Wang,et al.  Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement , 2020, ECCV.

[215]  Bernt Schiele,et al.  DeeperCut: A Deeper, Stronger, and Faster Multi-person Pose Estimation Model , 2016, ECCV.

[216]  Hao Zhu,et al.  CrowdPose: Efficient Crowded Scenes Pose Estimation and a New Benchmark , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[217]  Pascal Fua,et al.  Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation , 2018, ECCV.

[218]  Cewu Lu,et al.  RMPE: Regional Multi-person Pose Estimation , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[219]  Andrew Zisserman,et al.  Deep Convolutional Neural Networks for Efficient Pose Estimation in Gesture Videos , 2014, ACCV.

[220]  Cristian Sminchisescu,et al.  Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[221]  Song-Chun Zhu,et al.  Learning Pose Grammar to Encode Human Body Configuration for 3D Pose Estimation , 2017, AAAI.

[222]  Gang Wang,et al.  Feature Boosting Network For 3D Pose Estimation , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[223]  Yaser Sheikh,et al.  Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[224]  Liu Wu,et al.  Human Mesh Recovery From Monocular Images via a Skeleton-Disentangled Representation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[225]  Mao Ye,et al.  Distribution-Aware Coordinate Representation for Human Pose Estimation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).