Joint-oriented Features for Skeleton-based Action Recognition

In this paper, we propose joint-oriented features for skeleton-based action recognition, which aims to decrease the influence of ambiguous joints in the skeleton sequences. When doing skeleton-based action recognition, we noticed that if the skeleton data contains noisy joints, the result would be influenced by the noise. Since the selections of disambiguous joints might be impossible, it would be a hard task for humans to distinguish whether the joint in the frame is correct. To deal with this situation, we propose joint-oriented features to train joint-oriented models. If some joints are noise in the frame, the corresponding joint-oriented models would not perform well on the case. As we could not preFigure the noise in the data, we apply ensemble modeling with our joint-oriented models to let the disambiguous joints correct the ambiguous ones. To demonstrate the effectiveness of our proposed method, we conducted the experiments on three benchmark skeleton-based action datasets, including the large-scale challenging NTU-RGBD, and our approach achieves competitive performance over the state-of-the-art.

[1]  Guoliang Lu,et al.  Extraction of action patterns using local temporal self-similarities of skeletal body-joints , 2013, 2013 6th International Congress on Image and Signal Processing (CISP).

[2]  Nanning Zheng,et al.  View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[3]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[4]  Mohan M. Trivedi,et al.  Joint Angles Similarities and HOG2 for Action Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[5]  Dehui Kong,et al.  Effective human action recognition using global and local offsets of skeleton joints , 2018, Multimedia Tools and Applications.

[6]  Wenjun Zeng,et al.  An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data , 2016, AAAI.

[7]  Yi Yang,et al.  Learning a 3D Human Pose Distance Metric from Geometric Pose Descriptor , 2011, IEEE Transactions on Visualization and Computer Graphics.

[8]  Alexandros André Chaaraoui,et al.  Fusion of Skeletal and Silhouette-Based Features for Human Action Recognition with RGB-D Devices , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[9]  Marco La Cascia,et al.  3D skeleton-based human action classification: A survey , 2016, Pattern Recognit..

[10]  Yansong Tang,et al.  Deep Progressive Reinforcement Learning for Skeleton-Based Action Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[11]  Xiaodong Yang,et al.  Effective 3D action recognition using EigenJoints , 2014, J. Vis. Commun. Image Represent..

[12]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[13]  Jake K. Aggarwal,et al.  View invariant human action recognition using histograms of 3D joints , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[14]  Dahua Lin,et al.  Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition , 2018, AAAI.

[15]  Hanseok Ko,et al.  Hidden Markov Model on a unit hypersphere space for gesture trajectory recognition , 2014, Pattern Recognit. Lett..

[16]  Georgios Evangelidis,et al.  Skeletal Quads: Human Action Recognition Using Joint Quadruples , 2014, 2014 22nd International Conference on Pattern Recognition.

[17]  Anoop Cherian,et al.  Tensor Representations via Kernel Linearization for Action Recognition from 3D Skeletons , 2016, ECCV.

[18]  Wanqing Li,et al.  Action recognition based on a bag of 3D points , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[19]  Marwan Torki,et al.  Human Action Recognition Using a Temporal Hierarchy of Covariance Descriptors on 3D Joint Locations , 2013, IJCAI.

[20]  Rama Chellappa,et al.  Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Mario Fernando Montenegro Campos,et al.  Online gesture recognition from pose kernel learning and decision forests , 2014, Pattern Recognit. Lett..

[22]  Kuldip K. Paliwal,et al.  Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..

[23]  Gang Wang,et al.  NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Yong Du,et al.  Skeleton based action recognition with convolutional neural network , 2015, 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR).

[25]  Jian Yang,et al.  Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition , 2018, AAAI.

[26]  Mohammed Bennamoun,et al.  A New Representation of Skeleton Sequences for 3D Action Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Zicheng Liu,et al.  HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Guo-Jun Qi,et al.  Differential Recurrent Neural Networks for Action Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[29]  Wei-Shi Zheng,et al.  Jointly Learning Heterogeneous Features for RGB-D Activity Recognition , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30]  Marco La Cascia,et al.  Hankelet-based dynamical systems modeling for 3D action recognition , 2015, Image Vis. Comput..

[31]  Joseph J. LaViola,et al.  Exploring the Trade-off Between Accuracy and Observational Latency in Action Recognition , 2013, International Journal of Computer Vision.

[32]  Anuj Srivastava,et al.  Accurate 3D action recognition using learning on the Grassmann manifold , 2015, Pattern Recognit..

[33]  Silvio Savarese,et al.  Structural-RNN: Deep Learning on Spatio-Temporal Graphs , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Ruzena Bajcsy,et al.  Sequence of the Most Informative Joints (SMIJ): A new representation for human skeletal action recognition , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[35]  Xiaoming Liu,et al.  On Geometric Features for Skeleton-Based Action Recognition Using Multilayer LSTM Networks , 2017, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).