Entropy-based approach to missing-links prediction

Link prediction is an active research field within network theory, aiming at uncovering missing connections or predicting the emergence of future relationships from the observed network structure. This paper contributes to the stream of research on missing-link prediction. We propose an entropy-based method to predict a given percentage of missing links by identifying them with the most probable non-observed ones. The link probabilities are computed by solving suitably defined null models over the accessible network structure. Comparing our likelihood-based, local method with the most popular algorithms over a set of economic, financial and food networks, we find that ours performs best according to several statistical indicators (e.g. the precision and the area under the ROC curve). Moreover, the entropy-based formalism adopted here extends straightforwardly to directed networks, thus overcoming one of the main limitations of current algorithms. The higher accuracy achievable with these methods, together with their greater flexibility, makes them strong competitors of available link-prediction algorithms.
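
The abstract does not say which null models are solved, so the following is only a minimal sketch of the general recipe under one assumption: the null model is the undirected binary configuration model, i.e. the maximum-entropy ensemble constrained by the observed degrees, in which the link probability is p_ij = x_i x_j / (1 + x_i x_j). The function names (`fit_fitnesses`, `link_probability`, `predict_missing_links`), the damped fixed-point solver and the toy edge list are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math
from itertools import combinations


def fit_fitnesses(degrees, tol=1e-12, max_iter=100_000):
    """Solve sum_{j != i} p_ij = k_i for the hidden variables x_i.

    Assumed model: p_ij = x_i * x_j / (1 + x_i * x_j).
    Uses a damped fixed-point iteration (geometric mean of the old
    value and the plain fixed-point update).
    """
    total = sum(degrees)
    if total == 0:
        return [0.0] * len(degrees)
    x = [k / math.sqrt(total) for k in degrees]  # common starting guess
    for _ in range(max_iter):
        new_x = []
        for i, k in enumerate(degrees):
            if k == 0:
                new_x.append(0.0)  # isolated nodes never receive links
                continue
            denom = sum(
                x[j] / (1.0 + x[i] * x[j]) for j in range(len(x)) if j != i
            )
            new_x.append(math.sqrt(x[i] * k / denom))
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    return x


def link_probability(x_i, x_j):
    """Connection probability of a node pair under the assumed null model."""
    z = x_i * x_j
    return z / (1.0 + z)


def predict_missing_links(n_nodes, edges, n_missing):
    """Return the n_missing most probable non-observed pairs with their scores."""
    observed = {tuple(sorted(e)) for e in edges}
    degrees = [0] * n_nodes
    for i, j in observed:
        degrees[i] += 1
        degrees[j] += 1
    x = fit_fitnesses(degrees)
    candidates = [
        ((i, j), link_probability(x[i], x[j]))
        for i, j in combinations(range(n_nodes), 2)
        if (i, j) not in observed
    ]
    # Most probable non-observed pairs first.
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates[:n_missing]


# Toy observed network (hypothetical data, for illustration only).
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (4, 5)]
for pair, prob in predict_missing_links(6, toy_edges, n_missing=3):
    print(pair, round(prob, 3))
```

Because the score of a pair depends only on the fitted variables of its two endpoints, the same ranking step would carry over to a directed null model by replacing `link_probability` with a function of separate out- and in-variables; that extension is not shown here.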
