Clustering-Based Collaborative Filtering for Link Prediction

In this paper, we propose a novel collaborative filtering approach for predicting the unobserved links in a network (or graph) that has both topological and node features. Our approach improves on the well-known compressed-sensing-based matrix completion method by introducing a new multiple-independent-Bernoulli-distribution model as the data sampling mask. It makes better link predictions because the model is more general and better matches the data distributions found in many real-world networks, such as social networks like Facebook. As a result, satisfactory stability of the prediction can be guaranteed. To obtain an accurate multiple-independent-Bernoulli-distribution model of the topological feature space, our approach adjusts the sampling of the network's adjacency matrix using clustering information from the node feature space. This yields better performance than methods that simply combine the two types of features. Experimental results on several benchmark datasets suggest that our approach outperforms the best existing link prediction methods.
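
The idea of a sampling mask built from several independent Bernoulli distributions, with the grouping taken from clusters in the node feature space, can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify the clustering method or the estimator, so the sketch assumes plain k-means on node features and one empirical observation rate per pair of clusters. The function names `kmeans_labels` and `blockwise_sampling_probs` are hypothetical.

```python
import numpy as np

def kmeans_labels(features, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); one cluster label per node."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # start from k distinct nodes chosen at random
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # squared distance from every node to every center
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            members = features[labels == c]
            if len(members):  # keep the old center if a cluster is empty
                centers[c] = members.mean(axis=0)
    return labels

def blockwise_sampling_probs(observed, labels, k):
    """Estimate one Bernoulli observation rate per pair of clusters.

    observed: (n, n) 0/1 mask, 1 where the adjacency entry was sampled.
    Returns an (n, n) matrix whose (i, j) entry is the empirical rate
    of the block (labels[i], labels[j]).
    """
    observed = np.asarray(observed, dtype=float)
    rates = np.zeros((k, k))
    for a in range(k):
        for b in range(k):
            block = observed[np.ix_(labels == a, labels == b)]
            rates[a, b] = block.mean() if block.size else 0.0
    # expand the k x k table back to one probability per node pair
    return rates[labels][:, labels]
```

A single global Bernoulli rate is the special case `k = 1`; allowing a separate rate per cluster pair is one simple way to let the mask vary across the network, which is the sense in which the abstract calls the model more general.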
