Multi-scale link prediction

The proliferation of online social networks such as LiveJournal, Flickr and Facebook has made their automated analysis an important problem. These networks are massive and continue to grow rapidly. A central task in social network analysis is proximity estimation, which infers how close different users are to one another; link prediction, in turn, is an important application of proximity estimation. However, many proximity measures are computationally expensive to compute, which makes them prohibitive for large-scale link prediction. One way to address this is to estimate proximity measures via low-rank approximation, but a single low-rank approximation may not be sufficient to represent the behavior of the entire network. In this paper, we propose Multi-Scale Link Prediction (MSLP), a link prediction framework that can handle massive networks. The basic idea of MSLP is to efficiently construct low-rank approximations of the network at multiple scales. To achieve this, we propose a fast tree-structured approximation algorithm. MSLP then combines the predictions made at the different scales to produce robust and accurate predictions. Experimental results on real-world datasets with more than a million nodes show the superior performance and scalability of our method.
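
As an illustration of estimating a proximity measure via low-rank approximation, the following is a minimal sketch and not the MSLP algorithm itself. It assumes the Katz measure as the proximity measure (the abstract does not name one) and a small, symmetric, unweighted adjacency matrix. The function names `katz_exact` and `katz_low_rank`, the damping parameter `beta`, and the toy graph are all hypothetical choices made for this example.

```python
import numpy as np

def katz_exact(A, beta):
    """Exact Katz scores: sum_{k>=1} beta^k A^k = (I - beta*A)^{-1} - I.

    Requires beta < 1 / (spectral radius of A) for the series to converge.
    """
    n = A.shape[0]
    return np.linalg.inv(np.eye(n) - beta * A) - np.eye(n)

def katz_low_rank(A, beta, rank):
    """Approximate Katz scores from a rank-`rank` eigendecomposition of A.

    With A ~= V diag(lam) V^T, the Katz series collapses to
    V diag(beta*lam / (1 - beta*lam)) V^T, so only `rank` eigenpairs are used.
    """
    eigvals, eigvecs = np.linalg.eigh(A)           # A is assumed symmetric
    top = np.argsort(-np.abs(eigvals))[:rank]      # largest-magnitude eigenvalues
    lam, V = eigvals[top], eigvecs[:, top]
    scale = beta * lam / (1.0 - beta * lam)
    return (V * scale) @ V.T

# Toy graph (assumption): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

beta = 0.1                                         # below 1/3, a bound on 1/spectral radius here
full = katz_exact(A, beta)
approx = katz_low_rank(A, beta, rank=2)
error = np.linalg.norm(full - approx) / np.linalg.norm(full)
```

In this sketch, a small `rank` keeps only the dominant global structure of the graph, which hints at why a single low-rank approximation may miss behavior that is local to one part of a large network.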
