The link-prediction problem for social networks

Given a snapshot of a social network, can we infer which new interactions among its members are likely to occur in the near future? We formalize this question as the link-prediction problem, and we develop approaches to link prediction based on measures for analyzing the “proximity” of nodes in a network. Experiments on large coauthorship networks suggest that information about future interactions can be extracted from network topology alone, and that fairly subtle measures for detecting node proximity can outperform more direct measures. © 2007 Wiley Periodicals, Inc.
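
As a minimal sketch of the idea that proximity computed from topology alone can rank candidate future links, the following scores each non-adjacent pair of nodes by its number of shared neighbors. The choice of common-neighbors counting is an assumption made for illustration, since the abstract does not name its measures. The function names and the toy `snapshot` edge list are likewise hypothetical, not taken from the paper.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Score each non-adjacent node pair by its number of shared neighbors.

    Assumption: common-neighbors counting stands in for a "proximity"
    measure; the paper's own measures are not specified in the abstract.
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already linked, so not a candidate for prediction
        shared = len(neighbors[u] & neighbors[v])
        if shared:
            scores[(u, v)] = shared
    return scores


def predict_links(edges, k):
    """Return the k highest-scoring candidate pairs (ties broken by name)."""
    scores = common_neighbor_scores(edges)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [pair for pair, _ in ranked[:k]]


# Hypothetical toy snapshot of a network (undirected edges).
snapshot = [("a", "b"), ("a", "c"), ("b", "c"),
            ("b", "d"), ("c", "d"), ("d", "e")]
print(predict_links(snapshot, 2))
```

In an evaluation of the kind the abstract describes, the ranked pairs from an earlier snapshot would be compared against the interactions that actually appear in a later one.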
