Multi-scale skeleton adaptive weighted GCN for skeleton-based human action recognition in IoT

Abstract: Skeleton-based human action recognition has become an active research topic due to its potential advantages. Graph convolution networks (GCNs) have achieved remarkable performance in modeling skeleton-based human action recognition in IoT. Capturing robust spatial–temporal features from the human skeleton requires a powerful feature extractor. However, most GCN-based methods use a fixed graph topology. In addition, they use only single-scale features and ignore multi-scale information. In this paper, we propose a multi-scale skeleton adaptive weighted graph convolution network (MS-AWGCN) for skeleton-based action recognition. Specifically, a multi-scale skeleton graph convolution network is adopted to extract richer spatial features of skeletons. Moreover, we develop a simple graph vertex fusion strategy, which learns the latent graph topology adaptively by replacing the handcrafted adjacency matrix with a learnable matrix. A weighted learning method, applied according to the different sampling strategies, enriches the features during aggregation. Experiments on three large datasets show that the proposed method achieves performance comparable to state-of-the-art methods. The proposed method attains improvements of 0.9% and 0.7% over the recent GCN-based method on the NTU RGB+D and Kinetics datasets, respectively.
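
To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch of a graph convolution that (a) aggregates over several neighborhood scales and (b) adds a learnable matrix to the handcrafted adjacency. It is an illustration under stated assumptions, not the MS-AWGCN implementation: the k-hop definition of "scale", the additive offset matrices, the symmetric normalization, and all function names (`k_hop_adjacency`, `normalize_adjacency`, `multi_scale_adaptive_gcn`) are hypothetical choices made here for clarity.

```python
import numpy as np


def k_hop_adjacency(a, k):
    """Binary matrix linking joints that are exactly k hops apart (assumed notion of scale)."""
    a_self = a + np.eye(a.shape[0])
    within_k = np.linalg.matrix_power(a_self, k) > 0
    within_prev = np.linalg.matrix_power(a_self, k - 1) > 0
    return (within_k & ~within_prev).astype(float)


def normalize_adjacency(a):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}; self-loops keep every degree positive."""
    a_hat = a + np.eye(a.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def multi_scale_adaptive_gcn(x, a, weights, offsets):
    """Sum over scales k of (normalized k-hop adjacency + learnable offset) @ x @ W_k.

    x:       (V, C_in) joint features for one frame
    a:       (V, V) handcrafted skeleton adjacency (no self-loops)
    weights: list of (C_in, C_out) matrices, one per scale
    offsets: list of (V, V) matrices that would be trained; zeros recover the fixed topology
    """
    out = 0.0
    for k, (w, b) in enumerate(zip(weights, offsets), start=1):
        topology = normalize_adjacency(k_hop_adjacency(a, k)) + b
        out = out + topology @ x @ w
    return out


# Toy skeleton: a chain of 5 joints with 3 input channels and 4 output channels.
rng = np.random.default_rng(0)
chain = np.zeros((5, 5))
for i in range(4):
    chain[i, i + 1] = chain[i + 1, i] = 1.0

x = rng.standard_normal((5, 3))
weights = [rng.standard_normal((3, 4)) for _ in range(2)]
offsets = [np.zeros((5, 5)) for _ in range(2)]  # untrained: no learned edges yet

out = multi_scale_adaptive_gcn(x, chain, weights, offsets)
print(out.shape)  # (5, 4)
```

With the offsets at zero the layer reduces to a fixed-topology multi-scale graph convolution; in training, the offsets would be updated by gradient descent so that edges absent from the physical skeleton can emerge.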
