Similarity index based on local paths for link prediction of complex networks.

Predictions of missing links in incomplete networks, such as protein-protein interaction networks, or of likely but not yet existing links in evolving networks, such as online friendship networks, can guide further experiments or provide valuable information to web users. In this paper, we present a local path index to estimate the likelihood that a link exists between two nodes. We propose a network model with controllable density and noise strength in link generation, and we collect data from six real networks. Extensive numerical simulations on both modeled and real networks demonstrate the effectiveness and efficiency of the local path index compared with two well-known and widely used indices: the common neighbors index and the Katz index. Indeed, the local path index gives predictions competitive in accuracy with those of the Katz index while requiring much less CPU time and memory, which makes it a strong candidate for practical applications in data mining of very large networks.
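The abstract names three similarity indices but does not give their formulas. The following is a minimal sketch, assuming the commonly cited definitions: common neighbors as the entries of A^2, the Katz index as the sum over all path lengths l of beta^l A^l, and the local path index as A^2 + eps * A^3 (an assumption here, since this section does not state it). The function names, the toy graph, and the values of `eps` and `beta` are illustrative only.

```python
import numpy as np

def common_neighbors(A):
    """Number of shared neighbors of each node pair: entries of A^2."""
    return A @ A

def local_path(A, eps=0.01):
    """Assumed local path form: A^2 + eps * A^3 (paths of length 2 and 3)."""
    A2 = A @ A
    return A2 + eps * (A2 @ A)

def katz(A, beta=0.01):
    """Katz index: sum_{l>=1} beta^l A^l = (I - beta*A)^{-1} - I.

    The series converges only if beta < 1 / (largest eigenvalue of A).
    """
    n = A.shape[0]
    identity = np.eye(n)
    return np.linalg.inv(identity - beta * A) - identity

# Toy undirected path graph 0-1-2-3 (hypothetical example data).
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

cn = common_neighbors(A)
lp = local_path(A)
kz = katz(A)

# Nodes 0 and 3 share no neighbor, so common neighbors cannot rank the pair,
# while the length-3 path 0-1-2-3 gives it a small nonzero local path score.
print(cn[0, 3], lp[0, 3], kz[0, 3])
```

Under these assumptions, the local path score needs only two sparse matrix products, whereas the Katz index requires a matrix inversion over the whole network, which is consistent with the CPU time and memory difference the abstract reports.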
