AHA-3D: A Labelled Dataset for Senior Fitness Exercise Recognition and Segmentation from 3D Skeletal Data

Automated assessment of fitness exercises has important applications in computer- and robot-based exercise coaches deployed at home, in gymnasiums, or in care centers. In this work, we introduce AHA-3D, a labelled dataset of 3D skeletal data sequences depicting standard fitness tests performed by young and elderly subjects, intended for automatic fitness exercise assessment. To the best of our knowledge, AHA-3D is the first publicly available dataset featuring multi-generational, male and female subjects with frame-level labels, enabling action segmentation as well as the estimation of metrics such as risk of falling and autonomy in daily tasks. We present two baseline methods for recognition and one for segmentation. For recognition, a model trained on joint positions achieved 88.2% ± 0.077 accuracy, and a model trained on joint positions and velocities achieved 91% ± 0.082 accuracy; using the Kolmogorov-Smirnov test, we determined that the model trained on velocities was superior. The segmentation baseline achieved 88.29% frame-level accuracy in detecting actions. These results show promising recognition and detection performance, suggesting AHA-3D's potential in practical applications such as exercise performance assessment and correction, elderly fitness level estimation, and fall risk estimation for the elderly.
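As a minimal sketch of the model comparison described above, the two-sample Kolmogorov-Smirnov test can be applied to the per-fold accuracy distributions of the two recognition baselines. The accuracy values below are illustrative placeholders, not the paper's actual per-fold results:

```python
# Hedged sketch: comparing two models' accuracy distributions with the
# two-sample Kolmogorov-Smirnov test, as described in the abstract.
from scipy.stats import ks_2samp

# Hypothetical per-fold accuracies (illustrative, not the paper's data):
acc_positions = [0.85, 0.87, 0.88, 0.90, 0.86, 0.89]  # positions only
acc_pos_vel = [0.89, 0.91, 0.92, 0.93, 0.90, 0.92]    # positions + velocities

stat, p_value = ks_2samp(acc_positions, acc_pos_vel)
print(f"KS statistic = {stat:.3f}, p-value = {p_value:.3f}")
# A small p-value indicates the two accuracy distributions differ,
# which would support preferring the position+velocity model.
```

The KS test makes no normality assumption, which is convenient when the number of cross-validation folds is small and the accuracy distribution is unknown.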
