Accelerating Large Scale Real-Time GNN Inference using Channel Pruning

Graph Neural Networks (GNNs) have proven to be powerful models for generating node embeddings for downstream applications. However, the high computational complexity of GNN inference makes it hard to deploy GNNs in large-scale or real-time applications. In this paper, we propose to accelerate GNN inference by pruning the dimensions in each layer with negligible accuracy loss. Our pruning framework uses a novel LASSO regression formulation for GNNs to identify the feature dimensions (channels) that have a high influence on the output activation. We identify two inference scenarios, full inference and batched inference, and design a pruning scheme for each based on its computation and memory usage. To further reduce the inference complexity, we store and reuse the hidden features of visited nodes, which significantly reduces the number of supporting nodes needed to compute the target embedding. We evaluate the proposed method on the node classification problem with five popular datasets and on a real-time spam detection application. We demonstrate that the pruned GNN models greatly reduce computation and memory usage with little accuracy loss. For full inference, the proposed method achieves an average 3.27× speedup with only a 0.002 drop in F1-Micro on GPU. For batched inference, the proposed method achieves an average 6.67× speedup with only a 0.003 drop in F1-Micro on CPU. To the best of our knowledge, we are the first to accelerate large-scale real-time GNN inference through channel pruning.

PVLDB Reference Format: Hongkuan Zhou, Ajitesh Srivastava, Hanqing Zeng, Rajgopal Kannan, and Viktor Prasanna. Accelerating Large Scale Real-Time GNN Inference using Channel Pruning. PVLDB, 14(9): XXX-XXX, 2021. doi:10.14778/3461535.3461547

PVLDB Availability Tag: The source code of this research paper has been made publicly available at https://github.com/tedzhouhk/GCNP.
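The channel-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository above for that). It assumes a single linear layer Y = XW without neighbor aggregation or nonlinearity, and the function name `lasso_channel_mask`, the penalty `lam`, and the toy data are all hypothetical. The sketch fits one coefficient per input channel with an L1 penalty, using a plain proximal-gradient (ISTA) loop, and keeps only the channels whose coefficient stays nonzero.

```python
import numpy as np

def lasso_channel_mask(X, W, lam=0.1, steps=500):
    """Hypothetical helper: fit a per-channel coefficient beta that minimizes
    0.5 * ||XW - sum_i beta_i * outer(X[:, i], W[i])||^2 + lam * ||beta||_1."""
    n, c = X.shape
    y = (X @ W).ravel()  # output of the unpruned layer, flattened
    # Column i holds the flattened contribution of channel i to the output.
    Z = np.stack([np.outer(X[:, i], W[i]).ravel() for i in range(c)], axis=1)
    L = np.linalg.norm(Z, 2) ** 2  # Lipschitz constant of the smooth part
    beta = np.zeros(c)
    for _ in range(steps):
        grad = Z.T @ (Z @ beta - y)
        b = beta - grad / L
        # Soft-thresholding is the proximal step for the L1 penalty.
        beta = np.sign(b) * np.maximum(np.abs(b) - lam / L, 0.0)
    return beta

# Toy data (assumed): channels 2 and 3 have zero weights, so they cannot
# influence the output and should be pruned.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
W = np.array([[1.0, 0.5, -1.0],
              [0.5, 1.0, 1.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

beta = lasso_channel_mask(X, W)
keep = np.flatnonzero(beta > 1e-6)         # indices of surviving channels
X_pruned, W_pruned = X[:, keep], W[keep]   # smaller layer, fewer multiply-adds
```

In this toy case the pruned layer reproduces the original output exactly because the removed channels contribute nothing. On real layers the pruned output is only an approximation, and a larger `lam` trades accuracy for fewer channels.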
