Link Prediction by Utilizing Correlations Between Link Types and Path Types in Heterogeneous Information Networks

Link prediction is a key technique in applications such as predicting whether a relationship exists in a biological network. Most existing work focuses on link prediction in homogeneous information networks. However, most real-world applications involve heterogeneous information networks, which contain multiple types of nodes and links. In a heterogeneous information network, link types and path types are correlated in complex ways, and this correlation is an important clue for link prediction. In this paper, we propose a link prediction method for heterogeneous information networks that takes this type correlation into account. We introduce the Local Relatedness Measure (LRM), which indicates how likely a link is to exist between nodes of different types. We also formulate TypeCorr, which quantitatively captures the correlation between a link type and a path type. We perform link prediction with a supervised learning method, using features obtained by combining TypeCorr with other relevant properties. Our experiments show that the proposed method improves the accuracy of link prediction on a real-world network.
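
To make the distinction between a link type and a path type concrete, the following minimal Python sketch counts the paths between two nodes of a toy heterogeneous network, grouped by path type (the sequence of link types along the path). It does not reproduce LRM or TypeCorr, whose definitions are not given in the abstract; the node names, link types, the `^-1` notation for reverse traversal, and the helper functions are all hypothetical. Counts of this kind are one plausible form of per-pair feature that a supervised classifier could consume.

```python
from collections import Counter, defaultdict

# Toy heterogeneous network: each edge is (source, link_type, target).
# All node names and link types are hypothetical illustrations.
EDGES = [
    ("drugA", "targets", "gene1"),
    ("drugB", "targets", "gene1"),
    ("drugA", "treats", "disease1"),
    ("drugB", "treats", "disease1"),
    ("drugA", "targets", "gene2"),
    ("drugC", "targets", "gene2"),
]


def build_adjacency(edges):
    """Index edges in both directions; reverse traversal is marked with '^-1'."""
    adj = defaultdict(list)
    for src, link_type, dst in edges:
        adj[src].append((link_type, dst))
        adj[dst].append((link_type + "^-1", src))
    return adj


def path_type_counts(adj, start, end, max_len=2):
    """Count simple paths from start to end with at most max_len links,
    grouped by path type (the tuple of link types along the path)."""
    counts = Counter()
    stack = [(start, (), (start,))]  # (current node, link types so far, visited nodes)
    while stack:
        node, types, visited = stack.pop()
        if node == end and types:
            counts[types] += 1
            continue
        if len(types) == max_len:
            continue
        for link_type, nxt in adj[node]:
            if nxt not in visited:
                stack.append((nxt, types + (link_type,), visited + (nxt,)))
    return counts


adj = build_adjacency(EDGES)
features = path_type_counts(adj, "drugA", "drugB")
for path_type, count in sorted(features.items()):
    print(" -> ".join(path_type), count)
```

In this toy network, `drugA` and `drugB` are connected by one path through a shared gene and one through a shared disease, so two distinct path types each receive a count of one.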
