Link Prediction for Egocentrically Sampled Networks

Link prediction in networks is typically accomplished by estimating or ranking the edge probabilities for all pairs of nodes. In practice, especially for social networks, the data are often collected by egocentric sampling: a subset of nodes is selected and all of their edges are recorded. This sampling mechanism calls for different prediction tools than those built on the usual assumption that links are missing at random. We propose a new, computationally efficient link prediction algorithm for egocentrically sampled networks, which estimates the underlying probability matrix by estimating its row space. For networks created by sampling rows, our method outperforms many popular link prediction and graphon estimation techniques.
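To make the row-space idea concrete, here is a minimal sketch in Python with NumPy. It is not the authors' algorithm. It assumes that the probability matrix is symmetric and approximately low rank, that the rank is known, and that a simple Nyström-style fit is adequate; the function name `estimate_probabilities` and all parameter names are hypothetical. The sketch takes the observed ego rows, uses their top right singular vectors as an estimate of the row space, and then fits a small core matrix so that the reconstruction agrees with the observed rows.

```python
import numpy as np


def estimate_probabilities(A, ego_idx, rank):
    """Illustrative row-space estimate of a symmetric low-rank probability matrix.

    A       : (n, n) array; only the rows indexed by ego_idx are read.
    ego_idx : indices of the sampled egos (hypothetical name).
    rank    : assumed rank of the probability matrix (an assumption of this sketch).
    """
    rows = np.asarray(A, dtype=float)[ego_idx, :]        # m x n observed rows

    # The top right singular vectors of the observed rows estimate the row space.
    _, _, Vt = np.linalg.svd(rows, full_matrices=False)
    V = Vt[:rank].T                                      # n x rank, orthonormal columns

    # Model P = V M V^T, so the observed rows satisfy rows ~ V[ego_idx] M V^T.
    # Solve for the small core matrix M by least squares, then symmetrize it.
    M = np.linalg.pinv(V[ego_idx, :]) @ rows @ V
    M = (M + M.T) / 2.0

    # Reconstruct all pairs, including ones where neither node is an ego.
    P_hat = V @ M @ V.T
    return np.clip(P_hat, 0.0, 1.0)


# Small synthetic demonstration (all values here are made up for illustration).
rng = np.random.default_rng(0)
n, r, m = 200, 2, 60
X = rng.uniform(0.1, 0.7, size=(n, r))
P_true = X @ X.T                                         # rank-2, entries in (0, 1)

upper = np.triu(rng.random((n, n)) < P_true, k=1)        # Bernoulli edges, no self-loops
A_full = (upper | upper.T).astype(float)                 # symmetric adjacency matrix

egos = rng.choice(n, size=m, replace=False)              # egocentric sample of nodes
P_hat_demo = estimate_probabilities(A_full, egos, rank=r)
```

When the input rows are noiseless and the rank assumption holds, this fit reproduces the full matrix, including the block of pairs in which neither node was sampled. With a real adjacency matrix the estimate is only as good as the estimated row space, and the rank would have to be chosen from the data.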
