Three-dimensional CNN-inspired deep learning architecture for Yoga pose recognition in the real-world environment

Existing techniques for Yoga pose recognition build classifiers on sophisticated handcrafted features computed from raw inputs captured in a controlled environment. These techniques often fail in complex real-world situations and thus limit the practical applicability of existing Yoga pose recognition systems. This paper presents an alternative, computationally efficient approach to Yoga pose recognition in complex real-world environments using deep learning. To this end, a Yoga pose dataset was created with the participation of 27 individuals (8 males and 19 females). The dataset consists of ten Yoga poses, namely Malasana, Ananda Balasana, Janu Sirsasana, Anjaneyasana, Tadasana, Kumbhakasana, Hasta Uttanasana, Paschimottanasana, Uttanasana, and Dandasana. The videos were captured using smartphone cameras with 4K resolution and a frame rate of 30 fps. For the recognition of Yoga poses in real time, a three-dimensional convolutional neural network (3D CNN) architecture is designed and implemented. The designed architecture is a modified version of the C3D architecture, which was initially introduced for the recognition of human actions. In the proposed modified C3D architecture, the computationally intensive fully connected layers are pruned, and supplementary layers such as batch normalization and average pooling are introduced for computational efficiency. To the best of our knowledge, this is among the first studies to utilize the inherent spatial–temporal relationship among Yoga poses for their recognition. The designed 3D CNN architecture achieved a test recognition accuracy of 91.15% on the in-house Yoga pose dataset consisting of ten Yoga poses. Furthermore, on the publicly available dataset, the designed architecture achieved a competitive test recognition accuracy of 99.39%, along with a multifold improvement in execution speed compared to the existing state-of-the-art technique. To promote further study, we will make the in-house Yoga pose dataset publicly available to the research community.
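To give a rough sense of why pruning the fully connected layers matters, the following sketch counts learnable parameters for a C3D-style network with two alternative classification heads. It is an illustration under stated assumptions, not the architecture proposed in this paper: the convolutional channel widths, the 3×3×3 kernels, the 4096-unit fully connected layers, and the 512×1×4×4 feature map after the last pooling stage are taken from the original C3D design, the ten output classes come from the dataset described above, and the average-pooling head (global average pooling followed by a single linear layer) is a hypothetical stand-in for the modified layers. Batch normalization parameters are ignored, since they add only two values per channel.

```python
# Parameter-count sketch: fully connected head vs. average-pooling head.
# Layer sizes follow the original C3D design (an assumption; the abstract
# does not list the modified architecture's exact layers).

def conv3d_params(c_in, c_out, k=3):
    """Weights plus biases of a 3D convolution with a k x k x k kernel."""
    return (k ** 3) * c_in * c_out + c_out

def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# (in_channels, out_channels) of the eight C3D convolution layers.
C3D_CONV_CHANNELS = [
    (3, 64), (64, 128), (128, 256), (256, 256),
    (256, 512), (512, 512), (512, 512), (512, 512),
]
NUM_CLASSES = 10                  # ten Yoga poses in the in-house dataset
POOL5_FEATURES = 512 * 1 * 4 * 4  # flattened C3D feature map (16x112x112 input)

conv_total = sum(conv3d_params(c_in, c_out) for c_in, c_out in C3D_CONV_CHANNELS)

# Original-style head: two 4096-unit layers followed by the classifier.
fc_head = (
    linear_params(POOL5_FEATURES, 4096)
    + linear_params(4096, 4096)
    + linear_params(4096, NUM_CLASSES)
)

# Hypothetical pruned head: global average pooling reduces the feature map
# to 512 values (no parameters), then one linear classifier.
gap_head = linear_params(512, NUM_CLASSES)

if __name__ == "__main__":
    print(f"convolution layers:   {conv_total:>12,d}")
    print(f"fully connected head: {fc_head:>12,d}")
    print(f"average-pooling head: {gap_head:>12,d}")
```

Under these assumptions, the fully connected head holds more parameters than all convolution layers combined, so replacing it with pooling removes most of the model's weights while leaving the spatio-temporal feature extractor untouched.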
