Predicting missing links via local information

Abstract

Missing link prediction in networks is of both theoretical interest and practical significance in modern science. In this paper, we empirically investigate a simple framework of link prediction on the basis of node similarity. We compare nine well-known local similarity measures on six real networks. The results indicate that the simplest measure, namely Common Neighbours, has the best overall performance, and the Adamic-Adar index performs second best. A new similarity measure, motivated by the resource allocation process taking place on networks, is proposed and shown to have higher prediction accuracy than Common Neighbours. It is found that many links are assigned the same scores if only the information of the nearest neighbours is used. We therefore design another new measure exploiting information on the next nearest neighbours, which can remarkably enhance the prediction accuracy.
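As a minimal sketch of the similarity-based framework described in the abstract, the Python below scores unconnected node pairs with three local indices, using their standard definitions from the link-prediction literature: Common Neighbours (the number of shared neighbours), Adamic-Adar (shared neighbours weighted by 1/log of their degree), and a resource-allocation index (shared neighbours weighted by 1/degree). The function names and the toy graph are illustrative assumptions and are not taken from the paper.

```python
import math
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbours(adj, x, y):
    """Number of neighbours shared by x and y."""
    return len(adj[x] & adj[y])


def adamic_adar(adj, x, y):
    """Sum of 1/log(degree) over shared neighbours.

    A shared neighbour of two distinct nodes has degree >= 2,
    so the logarithm is always positive.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in adj[x] & adj[y])


def resource_allocation(adj, x, y):
    """Sum of 1/degree over shared neighbours."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])


def rank_candidates(adj, score):
    """Score every non-adjacent pair and sort by descending score."""
    pairs = [
        (x, y)
        for x, y in combinations(sorted(adj), 2)
        if y not in adj[x]
    ]
    return sorted(pairs, key=lambda p: score(adj, *p), reverse=True)


# Hypothetical toy graph: hub "h" has four neighbours, "m" has two.
edges = [("h", "a"), ("h", "b"), ("h", "c"), ("h", "d"),
         ("m", "e"), ("m", "f")]
adj = build_adjacency(edges)

cn_ab = common_neighbours(adj, "a", "b")
cn_ef = common_neighbours(adj, "e", "f")
ra_ab = resource_allocation(adj, "a", "b")
ra_ef = resource_allocation(adj, "e", "f")
top_ra = rank_candidates(adj, resource_allocation)[0]
```

In this toy graph the pairs (a, b) and (e, f) each share exactly one neighbour, so Common Neighbours cannot separate them, whereas the degree-weighted indices favour (e, f) because its shared neighbour has lower degree. This is a small instance of the tied-score issue the abstract mentions for measures that use only nearest-neighbour information.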
