JT-MGCN: Joint-temporal Motion Graph Convolutional Network for Skeleton-Based Action Recognition

Recently, action recognition methods using graph convolutional networks (GCN) have shown remarkable performance thanks to its concise but effective representation of human body motion. Prior methods construct human body motion graph building edges between neighbor or distant body joints. On the other hand, human action contains lots of temporal variations showing strong temporal correlations between joint motions. Thus the characterization of an action requires a comprehensive analysis of joint motion correlations on spatial and temporal domains. In this paper, we propose Joint-temporal Motion Graph Convolutional Networks (JT-MGCN) in which joint-temporal edges learn the correlations between different joints at different time. Experimental evaluation on large public data sets such as NTU rgb+d data set and kinetics-skeleton data set show outstanding action recognition performance.

[1]  Wenjun Zeng,et al.  An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data , 2016, AAAI.

[2]  Lin Li,et al.  Skeleton-Based Relational Modeling for Action Recognition , 2018, ArXiv.

[3]  Qiang Ji,et al.  Bayesian Graph Convolution LSTM for Skeleton Based Action Recognition , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[4]  Yong Du,et al.  Hierarchical recurrent neural network for skeleton based action recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Tieniu Tan,et al.  Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning , 2018, ECCV.

[6]  Yansong Tang,et al.  Deep Progressive Reinforcement Learning for Skeleton-Based Action Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[7]  Junsong Yuan,et al.  Discriminative Spatio-Temporal Pattern Discovery for 3D Action Recognition , 2019, IEEE Transactions on Circuits and Systems for Video Technology.

[8]  Yi Lin,et al.  Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN , 2017, 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW).

[9]  Pichao Wang,et al.  Action Recognition Based on Joint Trajectory Maps Using Convolutional Neural Networks , 2016, ACM Multimedia.

[10]  Fabio Viola,et al.  The Kinetics Human Action Video Dataset , 2017, ArXiv.

[11]  Xiaohui Xie,et al.  Co-Occurrence Feature Learning for Skeleton Based Action Recognition Using Regularized Deep LSTM Networks , 2016, AAAI.

[12]  Gang Wang,et al.  NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Lei Shi,et al.  Skeleton-Based Action Recognition With Directed Graph Neural Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Dahua Lin,et al.  Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition , 2018, AAAI.

[15]  Seungkyu Lee,et al.  Motion regeneration using motion texture and autoencoder , 2018, SIGGRAPH ASIA Posters.

[16]  Gang Wang,et al.  Global Context-Aware Attention LSTM Networks for 3D Action Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Yaser Sheikh,et al.  OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Tieniu Tan,et al.  An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Hong Liu,et al.  Enhanced skeleton visualization for view invariant human action recognition , 2017, Pattern Recognit..

[20]  Sergio Escalera,et al.  RGB-D-based Human Motion Recognition with Deep Learning: A Survey , 2017, Comput. Vis. Image Underst..

[21]  Mohammed Bennamoun,et al.  Learning Clip Representations for Skeleton-Based 3D Action Recognition , 2018, IEEE Transactions on Image Processing.

[22]  Lei Shi,et al.  Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Shuai Li,et al.  Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Chao Li,et al.  Skeleton-based action recognition with convolutional neural networks , 2017, 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW).

[25]  Gang Wang,et al.  Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition , 2016, ECCV.

[26]  Nanning Zheng,et al.  View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[27]  Li Fei-Fei,et al.  Unsupervised Learning of Long-Term Motion Dynamics for Videos , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Austin Reiter,et al.  Interpretable 3D Human Action Analysis with Temporal Convolutional Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[29]  Mohammed Bennamoun,et al.  A New Representation of Skeleton Sequences for 3D Action Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Hong Liu,et al.  Two-Stream 3D Convolutional Neural Network for Skeleton-Based Action Recognition , 2017, ArXiv.