$\rm A^2Q$: Aggregation-Aware Quantization for Graph Neural Networks

As graph data grows, the high latency and memory consumption of inference pose a significant challenge to the real-world deployment of Graph Neural Networks (GNNs). While quantization is a powerful approach to reducing GNN complexity, most previous work on GNN quantization fails to exploit the unique characteristics of GNNs and suffers from severe accuracy degradation. Through an in-depth analysis of the topology of GNNs, we observe that the topology of the graph leads to significant differences between nodes, and that most nodes in a graph have small aggregation values. Motivated by this, we propose Aggregation-Aware mixed-precision Quantization ($\rm A^2Q$) for GNNs, in which an appropriate bitwidth is automatically learned and assigned to each node in the graph. To mitigate the vanishing gradient problem caused by sparse connections between nodes, we propose a Local Gradient method that uses the quantization error of the node features as supervision during training. We also develop a Nearest Neighbor Strategy to handle generalization to unseen graphs. Extensive experiments on eight public node-level and graph-level datasets demonstrate the generality and robustness of our proposed method. Compared to the FP32 models, our method achieves up to an 18.6x compression ratio (i.e., 1.70 bits) with negligible accuracy degradation. Moreover, compared to the state-of-the-art quantization method, our method achieves up to 11.4\% and 9.5\% accuracy improvements on node-level and graph-level tasks, respectively, and up to a 2x speedup on a dedicated hardware accelerator.
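The idea of assigning a separate bitwidth to each node can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `quantize_per_node` and `average_bits`, the toy star graph, and the hand-picked bitwidths and step sizes are all assumptions made for illustration, whereas in the method described above these parameters are learned during training. The sketch only shows why a high-degree node with a large aggregation value may need more bits than the many low-degree nodes.

```python
import numpy as np

def quantize_per_node(x, bits, step):
    """Uniform symmetric quantization with one bitwidth and one step size per node (row).

    Hypothetical helper for illustration; bits and step are given, not learned.
    """
    bits = np.asarray(bits).reshape(-1, 1)
    step = np.asarray(step, dtype=np.float64).reshape(-1, 1)
    qmax = 2.0 ** (bits - 1) - 1               # largest signed integer level per node
    levels = np.clip(np.round(x / step), -qmax, qmax)
    return levels * step                        # de-quantized values

def average_bits(bits):
    """Mean bitwidth over all nodes."""
    return float(np.mean(bits))

# Toy star graph (assumption): node 0 is a hub connected to nodes 1, 2, 3.
adj = np.array([[0, 1, 1, 1],
                [1, 0, 0, 0],
                [1, 0, 0, 0],
                [1, 0, 0, 0]], dtype=np.float64)
feats = np.ones((4, 2))
agg = adj @ feats                               # sum aggregation: hub rows are 3, leaf rows are 1

# Hand-picked mixed precision: more bits for the hub, fewer for the leaves.
bits = np.array([4, 2, 2, 2])
step = np.array([0.5, 1.0, 1.0, 1.0])
q_mixed = quantize_per_node(agg, bits, step)

# Uniform 2-bit baseline with step 1.0: the hub's value is clipped.
q_uniform = quantize_per_node(agg, [2, 2, 2, 2], [1.0, 1.0, 1.0, 1.0])

err_mixed = np.abs(q_mixed - agg).max()
err_uniform = np.abs(q_uniform - agg).max()
ratio = 32.0 / average_bits(bits)               # rough feature compression vs. FP32, ignoring overhead
```

In this toy setting the mixed-precision assignment reproduces the aggregated values exactly, while the uniform 2-bit baseline clips the hub node. The ratio `32 / average_bits` is only a rough estimate of feature compression relative to FP32 and ignores the cost of storing the per-node parameters.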
