A vertex similarity index using community information to improve link prediction accuracy

Link prediction plays an important role in complex network analysis. Its goal is to predict the existence of an unknown link or a future link in a network. Classical link prediction methods evaluate the similarity of two vertices based on their common neighbors, and assume that every common neighbor contributes equally to the connection likelihood. However, common neighbors may play different roles depending on whether they belong to the same community, that is, a group of vertices that are densely connected to one another and sparsely connected to other communities. This paper proposes a novel similarity index for link prediction that combines topology information with community information. The proposed approach is compared with ten classical local similarity indices on ten real-world networks. The experimental results show that the proposed approach improves the accuracy of link prediction regardless of which community detection algorithm is used.
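To make the contrast concrete, the following minimal sketch compares the classical common-neighbors count with a community-aware variant. It does not reproduce the index proposed in the paper, whose exact form is not given in this abstract. The function `community_weighted_score`, the weights `same` and `other`, and the toy graph are all hypothetical and chosen for illustration only.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def common_neighbors(adj: Graph, u: Hashable, v: Hashable) -> Set[Hashable]:
    """Return the vertices adjacent to both u and v."""
    return adj[u] & adj[v]


def cn_score(adj: Graph, u: Hashable, v: Hashable) -> int:
    """Classical common-neighbors index: every shared neighbor counts as 1."""
    return len(common_neighbors(adj, u, v))


def community_weighted_score(
    adj: Graph,
    community: Dict[Hashable, int],
    u: Hashable,
    v: Hashable,
    same: float = 1.0,   # hypothetical weight: neighbor shares the community of u and v
    other: float = 0.5,  # hypothetical weight: any other case
) -> float:
    """Illustrative community-aware score (an assumption, not the paper's index)."""
    score = 0.0
    for z in common_neighbors(adj, u, v):
        if community[z] == community[u] == community[v]:
            score += same
        else:
            score += other
    return score


# Toy undirected graph with edges a-c, b-c, c-d, a-e, b-e, e-f.
adj: Graph = {
    "a": {"c", "e"},
    "b": {"c", "e"},
    "c": {"a", "b", "d"},
    "d": {"c"},
    "e": {"a", "b", "f"},
    "f": {"e"},
}

# Assumed community labels; in practice they would come from a
# community detection algorithm run on the network.
community = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "f": 1}

print(cn_score(adj, "a", "b"))                             # 2
print(community_weighted_score(adj, community, "a", "b"))  # 1.5
```

For the pair (a, b), both c and e are common neighbors, so the classical index gives 2. Under the assumed weights, c (same community as a and b) contributes 1.0 and e (a different community) contributes 0.5, so the two neighbors are no longer treated as equal evidence for a link.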
