Linkless Link Prediction via Relational Distillation

Graph Neural Networks (GNNs) have been widely used on graph data and have shown exceptional performance on the task of link prediction. Despite their effectiveness, GNNs often suffer from high inference latency in practical deployments due to their non-trivial neighborhood data dependency. To address this issue, researchers have proposed knowledge distillation (KD) methods that transfer knowledge from teacher GNNs to student MLPs, which are efficient even at industrial scale, and have shown promising results on node classification. Nonetheless, using KD to accelerate link prediction remains unexplored. In this work, we first explore two direct analogs of traditional KD for link prediction: predicted logit-based matching and node representation-based matching. Upon observing that these direct KD analogs do not perform well for link prediction, we propose a relational KD framework, Linkless Link Prediction (LLP). Unlike simple KD methods that match independent link logits or node representations, LLP distills relational knowledge centered around each (anchor) node to the student MLP. Specifically, we propose two complementary matching strategies: rank-based matching and distribution-based matching. Extensive experiments demonstrate that LLP boosts the link prediction performance of MLPs by significant margins and even outperforms the teacher GNNs on 6 out of 9 benchmarks. LLP also achieves a 776.37× speedup in link prediction inference compared to GNNs on the large-scale OGB-Citation2 dataset.
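
To make the two matching strategies concrete, below is a minimal PyTorch sketch of what rank-based and distribution-based matching could look like for a single anchor node, assuming the teacher GNN and the student MLP each produce a vector of link scores over the same set of sampled context nodes. The function names, the margin, and the temperature are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def distribution_matching_loss(teacher_scores, student_scores, temperature=1.0):
    """Distribution-based matching (sketch): KL divergence between the teacher's
    and the student's link-score distributions over one anchor's context nodes."""
    # Soften both score vectors into probability distributions over context nodes.
    t_prob = F.softmax(teacher_scores / temperature, dim=-1)
    s_log_prob = F.log_softmax(student_scores / temperature, dim=-1)
    # KL(teacher || student), summed over the context nodes.
    return F.kl_div(s_log_prob, t_prob, reduction="sum")

def rank_matching_loss(teacher_scores, student_scores, margin=0.1):
    """Rank-based matching (sketch): for every pair of context nodes, the student
    should preserve the teacher's relative ordering of link scores."""
    # Pairwise score differences among the context nodes of one anchor.
    t_diff = teacher_scores.unsqueeze(0) - teacher_scores.unsqueeze(1)
    s_diff = student_scores.unsqueeze(0) - student_scores.unsqueeze(1)
    # +1 / -1 / 0 depending on how the teacher orders each pair.
    target = torch.sign(t_diff)
    # Hinge penalty when the student's ordering disagrees with the teacher's.
    return F.relu(-target * s_diff + margin * target.abs()).mean()

# Toy usage: scores of one anchor node against 5 sampled context nodes.
teacher_scores = torch.tensor([2.3, 0.1, -1.2, 0.8, 1.5])
student_scores = torch.tensor([1.9, 0.3, -0.7, 0.5, 1.1])
loss = distribution_matching_loss(teacher_scores, student_scores) \
       + rank_matching_loss(teacher_scores, student_scores)
```

In practice, these two terms would be combined, e.g., as a weighted sum, with the standard supervised link prediction loss on the training edges when training the student MLP.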
