Breaking the Barrier to Transferring Link Information across Networks

Link prediction is one of the most fundamental problems in graph modeling and mining. It has been studied in a wide range of scenarios, from uncovering missing links between entities in databases to recommending relations between people in social networks. In this problem, we wish to predict unseen links in a growing target network by exploiting existing structures in source networks. Most existing methods assume that abundant links are available in the target network to build a model for link prediction. However, in many scenarios, the target network may be too sparse to support a robust inference process, and this paucity of link data makes link prediction challenging. On the other hand, in many cases, other (more densely linked) auxiliary networks are available that contain link structure similar and relevant to that of the target network. The linkage information in these existing networks can be used in conjunction with the node attribute information in both networks to make more accurate link recommendations. Thus, this paper proposes the use of learning methods to perform link inference by transferring link information from the source network to the target network. We also note that the source network may contain link information that is irrelevant to the target network. This leads to cross-network bias, which leaves a link model built upon the source network misaligned with the link structure of the target network. Therefore, we re-sample the source network to rectify this bias: the re-sampling maximizes the cross-network relevance measured by the node attributes, while preserving as much link information as possible so that the source link structure is not lost in the process. With link structures aligned across the networks, a link model based on the re-sampled source network can make more accurate link predictions on the target network. We present experimental results illustrating the effectiveness of the approach.
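The re-sampling idea can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes relevance is measured by how closely the weighted mean of source node attributes matches the target attribute mean, it uses an exponentiated-gradient update to keep the weights on the probability simplex, and it omits the term that preserves link information. The function name `resample_weights` and the parameters `steps` and `lr` are hypothetical.

```python
import numpy as np


def resample_weights(Xs, Xt, steps=200, lr=0.5):
    """Illustrative sketch: weight source nodes so their weighted attribute
    mean approaches the target attribute mean.

    Xs: (n_source, d) source node attributes
    Xt: (n_target, d) target node attributes
    Returns a non-negative weight vector over source nodes that sums to 1.
    """
    Xs = np.asarray(Xs, dtype=float)
    Xt = np.asarray(Xt, dtype=float)
    mu_t = Xt.mean(axis=0)                    # target attribute mean
    w = np.full(Xs.shape[0], 1.0 / Xs.shape[0])  # start from uniform weights
    for _ in range(steps):
        residual = Xs.T @ w - mu_t            # mismatch of weighted source mean
        grad = 2.0 * (Xs @ residual)          # gradient of ||Xs^T w - mu_t||^2
        w = w * np.exp(-lr * grad)            # multiplicative (mirror) update
        w = w / w.sum()                       # renormalize onto the simplex
    return w


# Toy data (assumed for illustration): two source nodes resemble the
# target nodes, two do not.
Xs_demo = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
Xt_demo = np.array([[1.0, 1.0], [1.0, 1.0]])
w_demo = resample_weights(Xs_demo, Xt_demo)
```

On this toy input, most of the weight moves to the two source nodes whose attributes match the target. A link model would then be trained on source links sampled in proportion to such weights; how to trade this off against retaining link structure is left out of the sketch.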
