Modeling social network relationships via t-cherry junction trees

The massive scale of online social networks makes it challenging to characterize their underlying structure. In this paper, we employ the t-cherry junction tree, a recent advance in probabilistic graphical models, to develop a compact representation and a good approximation of an otherwise intractable model of users' relationships in a social network. This approach has three advantages: (1) the best approximation achievable with junction trees belongs to the class of t-cherry junction trees; (2) construction of a t-cherry junction tree can be largely parallelized; and (3) inference can be performed using distributed computation. To improve the quality of the approximation, we also devise an algorithm that builds a higher-order tree from an existing one instead of constructing it from scratch. We apply this approach to Twitter data containing 100,000 nodes and study the problem of recommending connections to new users.
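To make the idea of a junction-tree approximation concrete, the following minimal sketch evaluates one on a toy distribution. It is an illustration under stated assumptions, not the algorithm of the paper: it uses the standard junction-tree factorization (product of cluster marginals divided by product of separator marginals), fixes the tree by hand rather than searching for the best t-cherry tree, and uses three binary variables instead of real network data. All function and variable names (`marginal`, `junction_tree_approx`, `kl_divergence`, `chain`, `noisy_xor`) are hypothetical.

```python
from itertools import product
from math import log


def marginal(joint, idx):
    """Marginalize a joint (dict: outcome tuple -> probability) onto positions idx."""
    out = {}
    for x, p in joint.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out


def junction_tree_approx(joint, clusters, separators):
    """Product of cluster marginals divided by product of separator marginals."""
    cluster_margs = [(c, marginal(joint, c)) for c in clusters]
    sep_margs = [(s, marginal(joint, s)) for s in separators]
    approx = {}
    for x in joint:
        num = 1.0
        for c, m in cluster_margs:
            num *= m[tuple(x[i] for i in c)]
        den = 1.0
        for s, m in sep_margs:
            den *= m[tuple(x[i] for i in s)]
        approx[x] = num / den if den > 0 else 0.0
    return approx


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats."""
    return sum(pv * log(pv / q[x]) for x, pv in p.items() if pv > 0)


# Assumed toy joint 1: a Markov chain A -> B -> C over binary variables.
p_a = [0.6, 0.4]
p_b_given_a = [[0.7, 0.3], [0.2, 0.8]]
p_c_given_b = [[0.9, 0.1], [0.4, 0.6]]
chain = {
    (a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
    for a, b, c in product((0, 1), repeat=3)
}

# Assumed toy joint 2: C is a noisy XOR of two independent fair bits A and B.
noisy_xor = {
    (a, b, c): 0.25 * (0.9 if c == (a ^ b) else 0.1)
    for a, b, c in product((0, 1), repeat=3)
}

# Hand-picked tree: clusters {A,B} and {B,C} joined through separator {B}.
clusters = [(0, 1), (1, 2)]
separators = [(1,)]

approx_chain = junction_tree_approx(chain, clusters, separators)
approx_xor = junction_tree_approx(noisy_xor, clusters, separators)
kl_chain = kl_divergence(chain, approx_chain)
kl_xor = kl_divergence(noisy_xor, approx_xor)
print(kl_chain, kl_xor)
```

For the chain, the divergence vanishes up to floating-point rounding because the distribution factorizes exactly over the chosen clusters. For the noisy XOR it is strictly positive, since no pair of variables captures the three-way dependence; this is the kind of gap that motivates moving to a higher-order tree with larger clusters.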
