VQ-GNN: A Universal Framework to Scale up Graph Neural Networks using Vector Quantization

Most state-of-the-art Graph Neural Networks (GNNs) can be defined as a form of graph convolution, which can be realized by message passing between direct neighbors or beyond. To scale such GNNs to large graphs, various neighbor-, layer-, or subgraph-sampling techniques have been proposed to alleviate the “neighbor explosion” problem by considering only a small subset of the messages passed to the nodes in a mini-batch. However, sampling-based methods are difficult to apply to GNNs that utilize many-hops-away or global context in each layer, show unstable performance across tasks and datasets, and do not speed up model inference. We propose a principled and fundamentally different approach, VQ-GNN, a universal framework to scale up any convolution-based GNN using Vector Quantization (VQ) without compromising performance. In contrast to sampling-based techniques, our approach effectively preserves all the messages passed to a mini-batch of nodes by learning and updating a small number of quantized reference vectors of global node representations, using VQ within each GNN layer. Our framework avoids the “neighbor explosion” problem by combining the quantized representations with a low-rank version of the graph convolution matrix, and we show both theoretically and experimentally that such a compact low-rank version of the gigantic convolution matrix is sufficient. In conjunction with VQ, we design a novel approximated message passing algorithm and a nontrivial back-propagation rule for our framework. Experiments on various types of GNN backbones demonstrate the scalability and competitive performance of our framework on large-graph node classification and link prediction benchmarks.
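To make the core mechanism concrete, below is a minimal PyTorch-style sketch of the idea described above: node representations are quantized against a small codebook, and messages from out-of-mini-batch neighbors are approximated by the codewords of those neighbors instead of their exact features. The function names (`vq_assign`, `ema_update`, `approx_message_passing`), the EMA-style codebook update, and the tensor shapes are illustrative assumptions, not the paper's exact algorithm or its back-propagation rule.

```python
import torch

def vq_assign(h, codebook):
    """Assign each node representation to its nearest codeword (L2 distance)."""
    d = torch.cdist(h, codebook)          # [n, K] distances to K codewords
    return d.argmin(dim=1)                # index of nearest codeword per node

def ema_update(h, idx, codebook, momentum=0.9):
    """Illustrative codebook update: move each codeword toward the mean of its assigned features."""
    with torch.no_grad():
        for k in range(codebook.size(0)):
            mask = idx == k
            if mask.any():
                codebook[k] = momentum * codebook[k] + (1 - momentum) * h[mask].mean(dim=0)
    return codebook

def approx_message_passing(A_in, A_out, h_batch, codebook, idx_out):
    """One layer of approximated message passing for a mini-batch (sketch).

    A_in:     [b, b]  convolution weights among in-batch nodes (exact messages)
    A_out:    [b, m]  convolution weights from out-of-batch neighbors
    h_batch:  [b, d]  exact features of the in-batch nodes
    idx_out:  [m]     codeword indices assigned to the out-of-batch neighbors
    """
    exact = A_in @ h_batch                 # messages from in-batch neighbors, kept exact
    quantized = A_out @ codebook[idx_out]  # out-of-batch neighbors replaced by their codewords
    return exact + quantized
```

The sketch only illustrates the forward approximation; in the actual framework the codebook and assignments also participate in training through the dedicated back-propagation rule mentioned above.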
