Supervised Machine Learning Applied to Link Prediction in Bipartite Social Networks

This work addresses the problem of link prediction in large-scale two-mode social networks. Two variations of the link prediction task are studied: predicting links in a bipartite graph, and predicting links in a unimodal graph obtained by projecting a bipartite graph onto one of its node sets. For both tasks, we show empirically that taking the bipartite nature of the graph into account can substantially enhance the performance of the learned prediction models. This is achieved by introducing new variations of topological attributes that measure the likelihood that two nodes are connected. For both tasks, our approach consists of expressing the link prediction problem as a two-class discrimination problem. Classical supervised machine learning approaches can then be applied to learn prediction models. Experimental validation of the proposed approach is carried out on two real data sets: a co-authorship network extracted from the DBLP bibliographical database and a bipartite graph of the transaction history of an online music e-commerce site.
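As an illustration of the general idea only, the following minimal sketch projects a toy bipartite graph onto one node set and turns link prediction in the projection into a two-class problem. Everything in it is an assumption made for the example: the function names, the toy data, and in particular the two features (common neighbours in the projection, and a product of bipartite degrees), which stand in for the topological attributes and are not the ones introduced in this work.

```python
from itertools import combinations


def invert(memberships):
    """Map each top node (e.g. an author) to its bottom nodes (e.g. papers)."""
    top_to_bottom = {}
    for bottom, tops in memberships.items():
        for top in tops:
            top_to_bottom.setdefault(top, set()).add(bottom)
    return top_to_bottom


def project(memberships):
    """Project a bipartite graph onto its top node set.

    Two top nodes are linked when they share at least one bottom node.
    """
    adj = {}
    for tops in memberships.values():
        for u in tops:
            adj.setdefault(u, set())
        for u, v in combinations(sorted(tops), 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def build_dataset(train, later):
    """Build (pairs, X, y) for pairs that are unlinked in the training projection.

    Features are computed on the training period only; the label is 1 when
    the pair becomes linked in the later period, 0 otherwise.
    """
    adj = project(train)
    adj_later = project(later)
    top_to_bottom = invert(train)
    pairs, X, y = [], [], []
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        # Hypothetical feature 1: common neighbours in the projected graph.
        common = len(adj[u] & adj[v])
        # Hypothetical feature 2: product of degrees in the bipartite graph.
        degree_product = len(top_to_bottom[u]) * len(top_to_bottom[v])
        pairs.append((u, v))
        X.append([common, degree_product])
        y.append(1 if v in adj_later.get(u, set()) else 0)
    return pairs, X, y


# Toy data (invented for the example): bottom node -> set of top nodes.
train = {"p1": {"a", "b"}, "p2": {"b", "c"}, "p3": {"c", "d"}}
later = {"p4": {"a", "c"}}
pairs, X, y = build_dataset(train, later)
```

The resulting `X` and `y` can be passed to any standard binary classifier. Because most unlinked pairs never become linked, real data sets of this kind are heavily imbalanced, which a practical setup would need to account for.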
