Contrastive Training for Models of Information Cascades

This paper proposes a model of information cascades as directed spanning trees (DSTs) over observed documents, together with a contrastive training procedure that exploits the partial temporal ordering of node infections in lieu of labeled training links. This combination of model and unsupervised training makes it possible to improve on models that use infection times alone and to exploit arbitrary features of the nodes and of the text content of messages in information cascades. With only basic node and time-lag features similar to those of previous models, the unsupervised DST model performs comparably to strong baselines on a blog network inference task. Unsupervised training with additional content features achieves significantly better results, reaching half the accuracy of a fully supervised model.
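The abstract does not spell out the training objective, so the following is only a minimal sketch of one way a contrastive objective over directed spanning trees could be computed. It assumes a log-linear score per candidate edge, a single known root that is strictly earlier than every other node, and the matrix-tree theorem for summing over trees; the names `tree_partition`, `contrastive_log_likelihood`, `scores`, and `times` are hypothetical and not taken from the paper.

```python
import math


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def tree_partition(weights, root=0):
    """Sum, over all spanning trees directed away from `root`, of the
    product of edge weights, where weights[i][j] is parent i -> child j.

    Uses the matrix-tree theorem: the determinant of the in-degree
    Laplacian with the root's row and column removed.
    """
    n = len(weights)
    keep = [v for v in range(n) if v != root]
    laplacian = [
        [
            sum(weights[i][r] for i in range(n) if i != r) if r == c
            else -weights[r][c]
            for c in keep
        ]
        for r in keep
    ]
    return determinant(laplacian)


def contrastive_log_likelihood(scores, times, root=0):
    """log Z(trees respecting temporal order) - log Z(all trees).

    Assumption (not from the source): an edge i -> j respects the
    ordering only if times[i] < times[j].
    """
    n = len(scores)
    all_w = [[math.exp(scores[i][j]) if i != j else 0.0 for j in range(n)]
             for i in range(n)]
    ordered_w = [[all_w[i][j] if times[i] < times[j] else 0.0
                  for j in range(n)] for i in range(n)]
    return (math.log(tree_partition(ordered_w, root))
            - math.log(tree_partition(all_w, root)))


# Three documents with uniform edge scores and strictly increasing times.
scores = [[0.0] * 3 for _ in range(3)]
times = [0, 1, 2]
value = contrastive_log_likelihood(scores, times)
```

With uniform scores, three trees rooted at node 0 exist and two of them respect the time ordering, so `value` equals `math.log(2 / 3)`. Maximizing such a quantity with respect to feature weights would push probability mass toward temporally consistent trees without requiring labeled links; whether the paper's objective takes exactly this form is not stated in the abstract.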
