Supervised Link Prediction Using Multiple Sources

Link prediction is a fundamental problem in social network analysis and in commercial applications such as Facebook and MySpace. Most existing research approaches this problem by exploring the topological structure of a social network using only one source of information. However, in many application domains, a number of auxiliary social networks and/or derived proximity networks are available in addition to the social network of interest. The contribution of this paper is twofold: (1) a supervised learning framework that can effectively and efficiently learn the dynamics of social networks in the presence of auxiliary networks, and (2) a feature design scheme for constructing a rich variety of path-based features from multiple sources, together with an effective feature selection strategy based on structured sparsity. Extensive experiments on three real-world collaboration networks show that our model can effectively learn to predict new links using multiple sources, yielding higher prediction accuracy than unsupervised and single-source supervised models.
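To make the idea of path-based features from multiple sources concrete, the following is a minimal sketch and not the paper's actual feature construction. It assumes each source is given as an adjacency matrix over the same node set, and it counts typed walks of a fixed length by multiplying the matrices in every ordering of sources (matrix products count walks, which may revisit nodes, rather than simple paths). The function name `path_features` and the toy networks `coauthor` and `citation` are hypothetical, chosen only for illustration.

```python
from itertools import product

import numpy as np


def path_features(sources, length=2):
    """Count typed walks of a given length across several networks.

    sources: dict mapping a source name to an (n x n) adjacency matrix.
    Returns a dict mapping a walk type such as "coauthor-citation" to an
    (n x n) matrix whose (i, j) entry is the number of walks from i to j
    that follow edges from the named sources in that order.
    """
    feats = {}
    for combo in product(sources, repeat=length):
        counts = sources[combo[0]]
        for name in combo[1:]:
            counts = counts @ sources[name]
        feats["-".join(combo)] = counts
    return feats


# Toy data (hypothetical): a target network and one auxiliary network
# over the same three nodes.
coauthor = np.array([[0, 1, 0],
                     [1, 0, 1],
                     [0, 1, 0]])
citation = np.array([[0, 0, 1],
                     [0, 0, 0],
                     [1, 0, 0]])

feats = path_features({"coauthor": coauthor, "citation": citation})

# Feature vector for the candidate link (0, 2): one count per walk type.
x_02 = {walk_type: int(m[0, 2]) for walk_type, m in feats.items()}
```

Each candidate node pair then has one feature per walk type, so the number of features grows exponentially with the walk length. That growth is the motivation for the feature selection step mentioned in the abstract; a supervised classifier with a sparsity-inducing penalty could be trained on such vectors, though the specific penalty used in the paper is not reproduced here.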
