Node Duplication Improves Cold-start Link Prediction

Graph Neural Networks (GNNs) are prominent in graph machine learning and have shown state-of-the-art performance in Link Prediction (LP) tasks. Nonetheless, recent studies show that GNNs struggle to produce good results on low-degree nodes despite their overall strong performance. In practical applications of LP, such as recommendation systems, improving performance on low-degree nodes is critical, as it amounts to tackling the cold-start problem of improving the experience of users with few observed interactions. In this paper, we investigate how to improve GNNs' LP performance on low-degree nodes while preserving their performance on high-degree nodes, and we propose a simple yet surprisingly effective augmentation technique called NodeDup. Specifically, NodeDup duplicates low-degree nodes and creates links between these nodes and their own duplicates before following the standard supervised LP training scheme. By leveraging a "multi-view" perspective for low-degree nodes, NodeDup significantly improves LP performance on low-degree nodes without compromising performance on high-degree nodes. Additionally, as a plug-and-play augmentation module, NodeDup can be easily applied to existing GNNs at very low computational cost. Extensive experiments show that, compared to GNNs and state-of-the-art cold-start methods, NodeDup achieves average improvements of 38.49%, 13.34%, and 6.76% on isolated, low-degree, and warm nodes, respectively, across all datasets.
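The duplication step described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function name `node_dup`, the `degree_threshold` cutoff, and the choice to copy each node's feature vector to its duplicate are assumptions made for illustration. The abstract states only that low-degree nodes are duplicated and linked to their own duplicates before standard supervised LP training.

```python
from collections import Counter


def node_dup(num_nodes, edges, features, degree_threshold=2):
    """Duplicate low-degree nodes and link each one to its own duplicate.

    num_nodes: number of nodes, labeled 0..num_nodes-1
    edges: list of undirected training edges (u, v)
    features: one feature vector (list of floats) per node
    degree_threshold: hypothetical cutoff; a node whose degree is at most
        this value is treated as low-degree (isolated nodes have degree 0)
    """
    # Count how many training edges touch each node.
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    new_edges = list(edges)
    new_features = [list(f) for f in features]
    duplicate_of = {}  # original node id -> id of its duplicate

    for node in range(num_nodes):
        if degree[node] <= degree_threshold:
            dup = len(new_features)  # next unused node id
            # Assumption: the duplicate reuses the original node's features.
            new_features.append(list(features[node]))
            duplicate_of[node] = dup
            # Link the low-degree node to its own duplicate.
            new_edges.append((node, dup))

    return new_edges, new_features, duplicate_of


# Toy graph: node 0 is well connected, node 3 has one edge, node 4 is isolated.
edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
features = [[float(i)] for i in range(5)]
new_edges, new_features, duplicate_of = node_dup(5, edges, features, degree_threshold=1)
print(duplicate_of)    # which nodes were duplicated, and their new ids
print(new_edges[-2:])  # the added node-to-duplicate links
```

In this sketch the augmented edge list and feature table would then be passed unchanged to whatever supervised LP training loop is already in use, which is what makes the step usable as a plug-and-play module.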
