Accurate similarity index based on the contributions of paths and end nodes for link prediction

Link prediction, which aims to estimate the likelihood that a link exists between two currently unconnected nodes, is an important task in complex network analysis. Similarity-based algorithms, which score candidate links by the similarity of their end nodes, are a popular solution. However, when calculating the similarity between two nodes, most similarity-based algorithms consider only the contributions of the paths connecting the two nodes and ignore the influence of the nodes themselves, which limits their accuracy. In this paper, a novel similarity index, called Scop, is proposed for link prediction. By directly defining both the contributions of paths to their end nodes and the contributions of the end nodes themselves, Scop distinguishes the contributions of different paths and also integrates the contributions of the end nodes, and can therefore achieve better accuracy. Experiments on 10 networks against six baselines indicate that Scop is markedly more accurate than the baselines.
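The general similarity-based approach described above can be illustrated with a minimal sketch. It does not implement Scop, whose definition is not given in this abstract. Instead it uses the standard common-neighbors index, which counts length-2 paths between two nodes and, like the path-only indices criticized above, ignores any contribution from the end nodes themselves. The function name `common_neighbor_scores` and the toy edge list are assumptions made for illustration only.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Rank unconnected node pairs by their number of common neighbors.

    Illustrative baseline only (not Scop): the score of a pair (u, v)
    equals the number of length-2 paths between u and v.
    """
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, so not a candidate for prediction
        scores[(u, v)] = len(adj[u] & adj[v])

    # Highest score first; ties are broken by node labels for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy network, used only to exercise the function.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
ranked = common_neighbor_scores(edges)
for pair, score in ranked:
    print(pair, score)
```

The pairs at the top of the ranking are the ones such an index would predict as the most likely missing links. An index that also accounts for the end nodes, as the abstract proposes, would add a node-dependent term to each score, but its form is not specified here.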
