Influence-Based Network-Oblivious Community Detection

How can we detect communities when the social graphs is not available? We tackle this problem by modeling social contagion from a log of user activity, that is a dataset of tuples (u, i, t) recording the fact that user u "adopted" item i at time t. This is the only input to our problem. We propose a stochastic framework which assumes that item adoptions are governed by un underlying diffusion process over the unobserved social network, and that such diffusion model is based on community-level influence. By fitting the model parameters to the user activity log, we learn the community membership and the level of influence of each user in each community. This allows to identify for each community the "key" users, i.e., the leaders which are most likely to influence the rest of the community to adopt a certain item. The general framework can be instantiated with different diffusion models. In this paper we define two models: the extension to the community level of the classic (discrete time) Independent Cascade model, and a model that focuses on the time delay between adoptions. To the best of our knowledge, this is the first work studying community detection without the network.

[1]  Anil K. Jain,et al.  Algorithms for Clustering Data , 1988 .

[2]  Elisa Lee,et al.  Statistical Methods for Survival Data Analysis: Lee/Survival Data Analysis , 2003 .

[3]  Laks V. S. Lakshmanan,et al.  Learning influence probabilities in social networks , 2010, WSDM '10.

[4]  Anil K. Jain,et al.  Unsupervised Learning of Finite Mixture Models , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[5]  Ravi Kumar,et al.  Influence and correlation in social networks , 2008, KDD.

[6]  Yu Wang,et al.  Community-based greedy algorithm for mining top-K influential nodes in mobile social networks , 2010, KDD.

[7]  Bernhard Schölkopf,et al.  Uncovering the Temporal Dynamics of Diffusion Networks , 2011, ICML.

[8]  Matthew Richardson,et al.  Mining the network value of customers , 2001, KDD '01.

[9]  Nicola Barbieri,et al.  Cascade-based community detection , 2013, WSDM.

[10]  Aristides Gionis,et al.  Sparsification of influence networks , 2011, KDD.

[11]  Vipin Kumar,et al.  A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs , 1998, SIAM J. Sci. Comput..

[12]  Ana L. N. Fred,et al.  Robust data clustering , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[13]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[14]  Jure Leskovec,et al.  Inferring networks of diffusion and influence , 2010, KDD.

[15]  Santo Fortunato,et al.  Community detection in graphs , 2009, ArXiv.

[16]  Masahiro Kimura,et al.  Prediction of Information Diffusion Probabilities for Independent Cascade Model , 2008, KES.

[17]  Elisa T. Lee,et al.  Statistical Methods for Survival Data Analysis , 1994, IEEE Transactions on Reliability.

[18]  Andrea Lancichinetti,et al.  Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. , 2009, Physical review. E, Statistical, nonlinear, and soft matter physics.

[19]  Éva Tardos,et al.  Maximizing the Spread of Influence through a Social Network , 2015, Theory Comput..