Geometric Knowledge Distillation: Topology Compression for Graph Neural Networks

We study a new paradigm of knowledge transfer that aims at encoding graph topological information into graph neural networks (GNNs) by distilling knowledge from a teacher GNN trained on a complete graph to a student GNN operating on a smaller or sparser graph. To this end, we revisit the connection between thermodynamics and the behavior of GNNs, based on which we propose the Neural Heat Kernel (NHK) to encapsulate the geometric properties of the underlying manifold with respect to the GNN architecture. A fundamental and principled solution, dubbed Geometric Knowledge Distillation, is derived by aligning the NHKs of the teacher and student models. We develop non-parametric and parametric instantiations and demonstrate their efficacy in various experimental settings covering different types of privileged topological information and teacher-student schemes.
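The abstract does not reproduce the paper's exact NHK formulation, but the alignment idea can be illustrated with a minimal sketch. The sketch below assumes the kernel is approximated from layer-wise node embeddings via a Gaussian (RBF) similarity, which is the Euclidean analogue of the heat kernel e^{-t\Delta}, and that teacher and student kernels are matched with a simple Frobenius-norm loss. All names here (`nhk_from_embeddings`, `geo_kd_loss`, the `lam` weight) are hypothetical, not the authors' implementation.

```python
# A minimal sketch of NHK-style distillation, assuming the neural heat
# kernel is approximated by pairwise similarities of layer-wise node
# embeddings. Illustrative only; not the paper's actual formulation.
import torch


def nhk_from_embeddings(h: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Heat-kernel-like similarity matrix from node embeddings h of shape
    (num_nodes, dim): an RBF kernel e^{-||x - y||^2 / 4*tau}, the Euclidean
    heat kernel up to normalization."""
    sq_dist = torch.cdist(h, h).pow(2)        # pairwise squared distances
    return torch.exp(-sq_dist / (4.0 * tau))


def geo_kd_loss(teacher_feats: list, student_feats: list,
                tau: float = 1.0) -> torch.Tensor:
    """Align kernels of matched teacher/student layers with a mean
    squared (Frobenius-style) penalty; the teacher is kept frozen."""
    loss = torch.zeros(())
    for h_t, h_s in zip(teacher_feats, student_feats):
        k_t = nhk_from_embeddings(h_t.detach(), tau)  # no teacher gradients
        k_s = nhk_from_embeddings(h_s, tau)
        loss = loss + (k_t - k_s).pow(2).mean()
    return loss


# Hypothetical usage: the student sees the sparse graph, the teacher the
# full graph, and the task loss is augmented with the kernel alignment.
#
# logits, student_feats = student(x, edge_index_sparse)
# with torch.no_grad():
#     _, teacher_feats = teacher(x, edge_index_full)
# loss = torch.nn.functional.cross_entropy(logits[mask], y[mask]) \
#        + lam * geo_kd_loss(teacher_feats, student_feats)
```

The RBF kernel and Frobenius alignment are the simplest stand-ins one could choose; the paper's parametric instantiation would replace the fixed kernel with a learnable one tied to the GNN layers, which this sketch does not attempt.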
