Multi-relational Link Prediction in Heterogeneous Information Networks

Many important real-world systems, modeled naturally as complex networks, have heterogeneous interactions and complicated dependency structures. Link prediction in such networks must model the influences between heterogenous relationships and distinguish the formation mechanisms of each link type, a task which is beyond the simple topological features commonly used to score potential links. In this paper, we introduce a novel probabilistically weighted extension of the Adamic/Adar measure for heterogenous information networks, which we use to demonstrate the potential benefits of diverse evidence, particularly in cases where homogeneous relationships are very sparse. However, we also expose some fundamental flaws of traditional a priori link prediction. In accordance with previous research on homogeneous networks, we further demonstrate that a supervised approach to link prediction can enhance performance and is easily extended to the heterogeneous case. Finally, we present results on three diverse, real-world heterogeneous information networks and discuss the trends and tradeoffs of supervised and unsupervised link prediction in a multi-relational setting.

[1]  M. McPherson,et al.  Birds of a Feather: Homophily in Social Networks , 2001 .

[2]  Lada A. Adamic,et al.  Friends and neighbors on the Web , 2003, Soc. Networks.

[3]  P. Radivojac,et al.  An integrated approach to inferring gene–disease associations in humans , 2008, Proteins.

[4]  Nitesh V. Chawla,et al.  New perspectives and methods in link prediction , 2010, KDD.

[5]  Igor Jurisica,et al.  Modeling interactome: scale-free or geometric? , 2004, Bioinform..

[6]  M. Newman Clustering and preferential attachment in growing networks. , 2001, Physical review. E, Statistical, nonlinear, and soft matter physics.

[7]  Jiawei Han,et al.  Mining Heterogeneous Information Networks by Exploring the Power of Links , 2009, ALT.

[8]  Padhraic Smyth,et al.  Prediction and ranking algorithms for event-based network data , 2005, SKDD.

[9]  Bart Selman,et al.  Referral Web: combining social networks and collaborative filtering , 1997, CACM.

[10]  M. Newman,et al.  The structure of scientific collaboration networks. , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[11]  Nitesh V. Chawla,et al.  Empirical comparison of correlation measures and pruning levels in complex networks representing the global climate system , 2011, 2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM).

[12]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[13]  A. Barabasi,et al.  Evolution of the social network of scientific collaborations , 2001, cond-mat/0104162.

[14]  Noah E. Friedkin,et al.  Social influence and opinions , 1990 .

[15]  Sergey Brin,et al.  The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[16]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .

[17]  H. Lehrach,et al.  A Human Protein-Protein Interaction Network: A Resource for Annotating the Proteome , 2005, Cell.

[18]  N. Christakis,et al.  SUPPLEMENTARY ONLINE MATERIAL FOR: The Collective Dynamics of Smoking in a Large Social Network , 2022 .

[19]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[20]  S. L. Wong,et al.  Towards a proteome-scale map of the human protein–protein interaction network , 2005, Nature.

[21]  P. Holland,et al.  A Method for Detecting Structure in Sociometric Data , 1970, American Journal of Sociology.

[22]  Stanley Wasserman,et al.  Social Network Analysis: Methods and Applications , 1994, Structural analysis in the social sciences.

[23]  N. Christakis,et al.  The Spread of Obesity in a Large Social Network Over 32 Years , 2007, The New England journal of medicine.

[24]  Jon M. Kleinberg,et al.  The link-prediction problem for social networks , 2007, J. Assoc. Inf. Sci. Technol..

[25]  Huan Liu,et al.  Uncoverning Groups via Heterogeneous Interaction Analysis , 2009, 2009 Ninth IEEE International Conference on Data Mining.