Spatio-Temporal Graph Scattering Transform

Although spatio-temporal graph neural networks have achieved great empirical success in handling multiple correlated time series, they may be impractical in real-world scenarios that lack sufficient high-quality training data. Furthermore, spatio-temporal graph neural networks lack theoretical interpretation. To address these issues, we put forth a novel, mathematically designed framework to analyze spatio-temporal data. Our proposed spatio-temporal graph scattering transform (ST-GST) extends traditional scattering transforms to the spatio-temporal domain. It iteratively applies spatio-temporal graph wavelets and nonlinear activation functions, which can be viewed as a forward pass of a spatio-temporal graph convolutional network without training. Since all the filter coefficients in ST-GST are mathematically designed, it is promising for real-world scenarios with limited training data and also allows for a theoretical analysis, which shows that the proposed ST-GST is stable to small perturbations of input signals and structures. Finally, our experiments show that i) ST-GST outperforms spatio-temporal graph convolutional networks with a 35% increase in accuracy on the MSR Action3D dataset; ii) it is better and computationally more efficient to design the transform based on separable spatio-temporal graphs than on joint ones; and iii) the nonlinearity in ST-GST is critical to empirical performance.
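The iterative structure described above (fixed wavelets, then a pointwise nonlinearity, repeated layer by layer) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the wavelets are diffusion wavelets built from a lazy random walk, the temporal graph is a path graph, the nonlinearity is the absolute value, each node of the scattering tree is summarized by a global average, and the function names (`lazy_walk`, `diffusion_wavelets`, `path_graph`, `st_scattering`) are hypothetical. The separable design is modeled by filtering the rows of the signal matrix with a spatial wavelet and its columns with a temporal wavelet.

```python
import numpy as np

def lazy_walk(adj):
    """Lazy random-walk matrix T = (I + A D^{-1}) / 2 (assumes no isolated nodes)."""
    deg = adj.sum(axis=0)
    return 0.5 * (np.eye(adj.shape[0]) + adj / deg)

def diffusion_wavelets(adj, num_scales):
    """Fixed filters [I - T, T - T^2, T^2 - T^4, ...]; nothing is learned."""
    T = lazy_walk(adj)
    filters = [np.eye(adj.shape[0]) - T]
    power = T
    for _ in range(num_scales - 1):
        next_power = power @ power
        filters.append(power - next_power)
        power = next_power
    return filters

def path_graph(n):
    """Adjacency matrix of an undirected path on n nodes (assumed temporal graph)."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return adj

def st_scattering(x, spatial_adj, temporal_adj, num_scales=2, num_layers=2):
    """Scattering features of x with shape (num_nodes, num_time_steps)."""
    hs = diffusion_wavelets(spatial_adj, num_scales)   # act on rows (space)
    ht = diffusion_wavelets(temporal_adj, num_scales)  # act on columns (time)
    layer = [x]
    features = [x.mean()]  # zeroth-order coefficient
    for _ in range(num_layers):
        next_layer = []
        for z in layer:
            for a in hs:
                for b in ht:
                    # Separable wavelet followed by the pointwise nonlinearity.
                    child = np.abs(a @ z @ b.T)
                    next_layer.append(child)
                    features.append(child.mean())
        layer = next_layer
    return np.array(features)

# Usage: 5 joints connected in a chain, observed over 8 frames.
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))
feats = st_scattering(x, path_graph(5), path_graph(8))
print(feats.shape)  # (21,): 1 + 4 + 16 coefficients for 2 scales per domain, 2 layers
```

Because the filters are fixed, the resulting feature vector could be passed directly to a simple classifier trained on few labeled samples. Applying the spatial and temporal filters as two small matrix products avoids ever forming a filter on the full product graph, which is one plausible reading of why a separable design is cheaper than a joint one.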
