When will it happen?: relationship prediction in heterogeneous information networks

Link prediction, i.e., predicting links or interactions between objects in a network, is an important task in network analysis. Although the problem has attracted much attention recently, several challenges remain unaddressed. First, most existing studies focus only on link prediction in homogeneous networks, where all objects and links are of the same type. In the real world, however, heterogeneous networks that consist of multi-typed objects and relationships are ubiquitous. Second, most current studies consider only whether a link will appear in the future and seldom ask when it will appear. In this paper, we address both issues and study the problem of predicting when a certain relationship will form in heterogeneous networks. First, we extend the link prediction problem to the relationship prediction problem by systematically defining both the target relation and the topological features using a meta path-based approach. Then, we directly model the distribution of the relationship building time as a function of the extracted topological features. Experiments on predicting citation relationships between authors in the DBLP network demonstrate the effectiveness of our methodology.
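
As a rough illustration of the two steps, the sketch below counts meta path instances on a toy bibliographic network and then maps a feature vector to an expected relationship building time. It is a minimal sketch under stated assumptions, not the method evaluated in the paper: the toy matrices, the function names `path_count` and `expected_time`, and the choice of an exponential distribution with a log-linear rate are all assumptions made here for illustration.

```python
import numpy as np


def path_count(*relations):
    """Count meta path instances by multiplying relation matrices in order.

    Entry [i, j] of the result is the number of path instances that start
    at object i and end at object j along the given sequence of relations.
    """
    result = relations[0]
    for relation in relations[1:]:
        result = result @ relation
    return result


# Toy data (hypothetical): 3 authors, 4 papers.
# writes[a, p] = 1 if author a wrote paper p.
writes = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
])
# cites[p, q] = 1 if paper p cites paper q.
cites = np.array([
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
])

# Topological feature: author - paper - author (shared papers).
coauthor_paths = path_count(writes, writes.T)
# Target relation: author - paper -> paper - author (author cites author).
citation_paths = path_count(writes, cites, writes.T)


def expected_time(features, beta):
    """Expected building time under an ASSUMED exponential model.

    Assumption: the rate is exp(features . beta), so the mean of the
    exponential distribution is 1 / rate = exp(-(features . beta)).
    """
    return float(np.exp(-(features @ beta)))
```

In this toy network, `citation_paths[0, 1]` is 1 because a paper by author 0 cites a paper by author 1. Larger path counts combined with positive coefficients in `beta` shorten the expected time under the assumed model; in practice the coefficients would be estimated from observed building times rather than chosen by hand.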
