Link prediction across networks by biased cross-network sampling

The problem of link inference has been widely studied in a variety of social networking scenarios. The goal is to predict future links in a growing network from the existing network structure. However, most existing methods work well only when the network already contains a significant number of links to support the inference process. In many scenarios, the existing network is too sparse, with too few links to enable meaningful learning, and this paucity of linkage information makes link inference difficult. In many such cases, however, other (more densely linked) networks are available that show a similar linkage structure in terms of the underlying attribute information in the nodes. The linkage information in these more densely linked networks can be used in conjunction with the node attribute information in both networks to make meaningful link recommendations. This paper therefore introduces the use of transfer learning methods for cross-network link inference. We present experimental results illustrating the effectiveness of the approach.
