FollowMeUp Sports: New Benchmark for 2D Human Keypoint Recognition

Human pose estimation has advanced significantly in recent years. However, existing datasets cover only a limited variety of poses. In this paper, we introduce a novel benchmark, “FollowMeUp Sports”, that makes an important advance in terms of specific postures, self-occlusion, and class balance, a contribution that we feel is required for the future development of human body models. This comprehensive dataset was collected using an established taxonomy of over 200 standard workout activities, recorded from three different shot angles. The collected videos cover a wider variety of specific workout activities than previous datasets, including push-ups, squats, and body movements near the ground with severe self-occlusion or occlusion by sports equipment and outfits. Given these rich images, we perform a detailed analysis of the leading human pose estimation approaches, gaining insights into the successes and failures of these methods.
