Temporal-Aware Graph Convolution Network for Skeleton-based Action Recognition

Graph convolutional networks (GCNs) have drawn attention for skeleton-based action recognition because a skeleton, with its joints and bones, can be naturally regarded as a graph. However, existing methods are limited in modeling the temporal sequence of human actions. To account for temporal factors in action modeling, we present a novel Temporal-Aware Graph Convolution Network (TA-GCN). First, we design a causal temporal convolution (CTCN) layer that prevents future information from leaking into the past. Second, we present a novel cross-spatial-temporal graph convolution (3D-GCN) layer that extends an adaptive graph from the spatial to the temporal domain to capture local cross-spatial-temporal dependencies among joints. By incorporating these two temporal factors, TA-GCN can model the sequential nature of human actions. Experimental results on two large-scale datasets, NTU-RGB+D and Kinetics-Skeleton, indicate that our network improves accuracy by about 1% on both datasets over previous methods.
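The idea behind a causal temporal convolution can be illustrated with a minimal sketch. The abstract does not specify the CTCN layer's kernel size, dilation, or channel layout, so the code below is only a generic one-dimensional causal convolution over a single feature track (for example, one coordinate of one joint across frames). The function name `causal_conv1d` and the kernel values are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence


def causal_conv1d(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Generic causal convolution: y[t] = sum_j kernel[j] * x[t - j].

    Frames before the start of the sequence are treated as zeros, so
    y[t] depends only on x[0..t] and never on later frames.
    (Illustrative sketch; not the paper's CTCN layer.)
    """
    k = len(kernel)
    # Pad only on the left (the past) with k - 1 zeros.
    padded = [0.0] * (k - 1) + list(x)
    out = []
    for t in range(len(x)):
        window = padded[t : t + k]  # holds x[t-k+1], ..., x[t]
        # kernel[0] weights the current frame, kernel[k-1] the oldest one.
        out.append(sum(kernel[j] * window[k - 1 - j] for j in range(k)))
    return out


# Hypothetical per-frame feature values and kernel weights.
frames = [1.0, 2.0, 3.0, 4.0]
weights = [0.5, 0.3, 0.2]
y = causal_conv1d(frames, weights)

# Changing only the last frame must leave all earlier outputs unchanged.
altered = causal_conv1d([1.0, 2.0, 3.0, 100.0], weights)
assert altered[:3] == y[:3]
```

Padding only on the past side is what distinguishes this from the symmetric padding typically used in temporal convolutions, where each output frame also sees later frames.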
