Edge Proposal Sets for Link Prediction

Graphs are a common model for complex relational data such as social networks and protein interactions, and such data can evolve over time (e.g., new friendships) and be noisy (e.g., unmeasured interactions). Link prediction aims to predict future edges or infer missing edges in the graph, and has diverse applications in recommender systems, experimental design, and complex systems. Even though link prediction algorithms strongly depend on the set of edges in the graph, existing approaches typically do not modify the graph topology to improve performance. Here, we demonstrate how simply adding a set of edges, which we call a proposal set, to the graph as a pre-processing step can improve the performance of several link prediction algorithms. The underlying idea is that if the edges in the proposal set generally align with the structure of the graph, link prediction algorithms are further guided towards predicting the right edges; in other words, adding a proposal set of edges is a signal-boosting pre-processing step. We show how to use existing link prediction algorithms to generate effective proposal sets and evaluate this approach on various synthetic and empirical datasets. We find that proposal sets meaningfully improve the accuracy of link prediction algorithms based on both neighborhood heuristics and graph neural networks. Code is available at https://github.com/CUAI/Edge-Proposal-Sets.
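The pre-processing step described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a common-neighbors heuristic stands in for the link prediction algorithm, the proposal set is simply the top-k scored non-edges, and the function names (`common_neighbor_scores`, `top_k_pairs`, `add_proposal_set`) are hypothetical. The authors' actual procedure, including how the proposal set size is chosen, may differ; see the linked repository for their implementation.

```python
from itertools import combinations


def common_neighbor_scores(adj):
    """Score every non-adjacent node pair by its number of common neighbors.

    `adj` maps each node to the set of its neighbors (undirected graph).
    """
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            scores[(u, v)] = len(adj[u] & adj[v])
    return scores


def top_k_pairs(scores, k):
    """Return the k highest-scoring pairs; ties are broken by node order."""
    ranked = sorted(scores, key=lambda pair: (-scores[pair], pair))
    return ranked[:k]


def add_proposal_set(adj, k):
    """Return a copy of the graph with the top-k proposed edges added.

    Assumption: the proposal set is the top-k non-edges under the
    common-neighbors score. This is an illustrative choice only.
    """
    proposal = top_k_pairs(common_neighbor_scores(adj), k)
    augmented = {node: set(nbrs) for node, nbrs in adj.items()}
    for u, v in proposal:
        augmented[u].add(v)
        augmented[v].add(u)
    return augmented, proposal


# Toy undirected graph given as an adjacency map (made up for illustration).
graph = {
    0: {1, 2},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {1, 2, 4},
    4: {3, 5},
    5: {4},
}

# Step 1: generate a proposal set and add it to the graph.
augmented_graph, proposal_set = add_proposal_set(graph, k=1)

# Step 2: run the link predictor again on the augmented graph.
final_scores = common_neighbor_scores(augmented_graph)
```

In this toy graph the added edge changes the scores of other candidate pairs, which is the sense in which a well-aligned proposal set can boost the signal available to the final predictor. In practice the proposal set could come from any link predictor, and the final predictor need not be the same one.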
