Model-free inference of diffusion networks using RKHS embeddings

In this paper we revisit the problem of inferring a diffusion network from information cascades. We make no assumptions about the underlying diffusion model, which yields a generic method with broader practical applicability. Our approach exploits the pairwise adoption-time intervals observed in cascades. Starting from the observation that different kinds of information spread differently, we interpret these time intervals as samples drawn from unknown (conditional) distributions. To distinguish these distributions statistically, we propose a novel method based on Reproducing Kernel Hilbert Space (RKHS) embeddings. Experiments on synthetic data and on real-world data from Twitter and Flixster show that our method significantly outperforms state-of-the-art methods. We argue that our algorithm can be implemented with parallel batch processing, and thus meets the efficiency and scalability needs of real-world applications.
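
To make the idea of statistically distinguishing interval distributions through RKHS embeddings concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It compares two samples of time intervals by the squared distance between their empirical kernel mean embeddings (the biased maximum mean discrepancy estimate). The Gaussian kernel, the bandwidth of 1.0, the exponential toy data, and the function names `gaussian_kernel` and `mmd2_biased` are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between two 1-D samples (illustrative choice)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(1, -1)
    return np.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Squared RKHS distance between the empirical mean embeddings of x and y.

    This is the biased MMD^2 estimate: mean k(x, x') + mean k(y, y')
    - 2 * mean k(x, y). It is zero when both samples are identical.
    """
    kxx = gaussian_kernel(x, x, bandwidth).mean()
    kyy = gaussian_kernel(y, y, bandwidth).mean()
    kxy = gaussian_kernel(x, y, bandwidth).mean()
    return kxx + kyy - 2.0 * kxy


# Toy data (hypothetical): adoption-time intervals for two spreading regimes.
rng = np.random.default_rng(0)
fast_a = rng.exponential(scale=1.0, size=300)  # fast-spreading content
fast_b = rng.exponential(scale=1.0, size=300)  # same regime, fresh sample
slow = rng.exponential(scale=5.0, size=300)    # slow-spreading content

same_regime = mmd2_biased(fast_a, fast_b)
different_regime = mmd2_biased(fast_a, slow)
print(f"same regime:      {same_regime:.4f}")
print(f"different regime: {different_regime:.4f}")
```

In this sketch, samples drawn from the same interval distribution give a value close to zero, while samples from different distributions give a clearly larger one. The bandwidth would normally be tuned to the data, for example with a median heuristic.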
