Analyzing the Role of Joint Subset Selection in Human Action Recognition

Joints of the human pose play an important role in human action recognition (HAR) thanks to their invariance to subject appearance, low computation and storage requirements, and rich information content. Psychologists [1] showed that human actions (e.g., walking, running, dancing) can be distinguished using only a subset of joints rather than the full-joint model. In this paper, we investigate the role of joint selection in HAR performance by proposing a pre-determined configuration of joints combined with temporal derivatives of the joint positions. To analyze the role of joint selection, the proposed scheme is compared with both full-joint and automatic joint selection schemes. The proposed method is evaluated on three public datasets: MSR-Action3D, UTKinect-Action, and Florence3D-Action. Experiments show that it achieves very competitive results compared with state-of-the-art methods. On the MSR-Action3D dataset, the proposed method outperforms existing methods with an accuracy of up to 95.53%. On the Florence3D-Action and UTKinect-Action datasets, its accuracy ranks second while its computation is 10-20 times faster than the top-ranking method.
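
To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of how a pre-determined joint subset and the temporal derivatives of its positions could be turned into per-frame features. The joint indices, the function name joint_subset_features, and the use of NumPy finite differences are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Hypothetical subset of informative joints; indices assume the 20-joint
# Kinect skeleton used by MSR-Action3D. The paper's actual pre-determined
# configuration may differ.
SELECTED_JOINTS = [0, 3, 7, 11, 13, 17, 19]

def joint_subset_features(skeleton_seq, joints=SELECTED_JOINTS):
    """Build per-frame descriptors from selected joints and their
    first-order temporal derivatives (velocities).

    skeleton_seq : ndarray of shape (T, J, 3) with 3D joint positions
                   for T frames and J joints.
    Returns an ndarray of shape (T, len(joints) * 6): positions
    concatenated with finite-difference velocities.
    """
    sub = skeleton_seq[:, joints, :]             # (T, K, 3) selected joints only
    vel = np.gradient(sub, axis=0)               # temporal derivative along frames
    feats = np.concatenate([sub, vel], axis=-1)  # (T, K, 6)
    return feats.reshape(feats.shape[0], -1)     # flatten per frame

# Usage example: a 50-frame sequence of a 20-joint skeleton
seq = np.random.rand(50, 20, 3)
print(joint_subset_features(seq).shape)          # (50, 42)
```

These per-frame descriptors can then be fed to any sequence classifier; using a fixed subset keeps the feature dimension small, which is one source of the computational advantage discussed above.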

References

[1] Stéphane Lecoeuche et al., "3D real-time human action recognition using a spline interpolation approach," in Proc. International Conference on Image Processing Theory, Tools and Applications (IPTA), 2015.

[2] Qifei Wang et al., "A Survey of Visual Analysis of Human Motion and Its Applications," arXiv preprint, 2016.

[3] Mehrtash Tafazzoli Harandi et al., "Going deeper into action recognition: A survey," Image and Vision Computing, 2016.

[4] Austin Reiter et al., "Interpretable 3D Human Action Analysis with Temporal Convolutional Networks," in Proc. IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2017.

[5] Marwan Torki et al., "Human Action Recognition Using a Temporal Hierarchy of Covariance Descriptors on 3D Joint Locations," in Proc. IJCAI, 2013.

[6] Ruzena Bajcsy et al., "Sequence of the Most Informative Joints (SMIJ): A new representation for human skeletal action recognition," in Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012.

[7] Yong Du et al., "Hierarchical recurrent neural network for skeleton based action recognition," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.

[8] Thi-Lan Le et al., "Novel Skeleton-based Action Recognition Using Covariance Descriptors on Most Informative Joints," in Proc. 10th International Conference on Knowledge and Systems Engineering (KSE), 2018.

[9] Yun Fu et al., "Human Action Recognition and Prediction: A Survey," International Journal of Computer Vision, 2018.

[10] Gang Wang et al., "Skeleton-Based Human Action Recognition With Global Context-Aware Attention LSTM Networks," IEEE Transactions on Image Processing, 2017.

[11] Ying Wu et al., "Mining actionlet ensemble for action recognition with depth cameras," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.

[12] Jake K. Aggarwal et al., "View invariant human action recognition using histograms of 3D joints," in Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012.

[13] Gang Wang et al., "Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition," in Proc. European Conference on Computer Vision (ECCV), 2016.

[14] Alberto Del Bimbo et al., "Recognizing Actions from Depth Cameras as Weakly Aligned Multi-part Bag-of-Poses," in Proc. IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2013.

[15] Daijin Kim et al., "Robust human activity recognition from depth video using spatiotemporal multi-fused features," Pattern Recognition, 2017.

[16] Luc Van Gool et al., "Deep Learning on Lie Groups for Skeleton-Based Action Recognition," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[17] Wanqing Li et al., "Action recognition based on a bag of 3D points," in Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2010.

[18] Jing Zhang et al., "RGB-D-based action recognition datasets: A survey," Pattern Recognition, 2016.

[19] G. Johansson, "Visual perception of biological motion and a model for its analysis," Perception & Psychophysics, 1973.

[20] Rama Chellappa et al., "Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.