Who Started It? Identifying Root Sources in Textual Conversation Threads.

In textual conversation threads, as found on many popular social media platforms, each user comment either originates a new thread of discussion or replies to a previous comment. An individual who makes an original comment, termed the "root source", is a topic initiator or even an information source, and identifying such individuals is of particular interest. The reply structure of comments is not always available (e.g., during the proliferation of a news event), so identifying root sources is a nontrivial task. In this paper, we develop a generative model based on marked multivariate Hawkes processes and introduce a novel concept, the "root source probability", to quantify the uncertainty in attributing possible root sources to each comment. We then derive a dynamic-programming-based algorithm to compute root source probabilities efficiently. Experiments on synthetic and real-world data show that our method identifies root sources that match ground truth and human intuition.
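To make the idea of a root source probability concrete, the following is a minimal sketch and not the algorithm derived in the paper. It assumes an unmarked multivariate Hawkes process with an exponential triggering kernel, so the text content (the marks) is ignored. The function names and the parameters `mu` (base rates per user), `alpha` (user-to-user influence) and `beta` (kernel decay) are illustrative assumptions. Under the standard branching interpretation of a Hawkes process, each comment is either spontaneous or triggered by one earlier comment, and the probability that comment `i` traces back to original comment `k` can be accumulated recursively over earlier comments.

```python
import math

def parent_probabilities(times, users, mu, alpha, beta):
    """P[i][j]: probability that comment j directly triggered comment i (j < i);
    P[i][i]: probability that comment i is spontaneous (starts a new thread).
    Assumes strictly increasing times, positive base rates mu, and an
    exponential kernel beta * exp(-beta * dt) (illustrative choice)."""
    n = len(times)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        weights = [0.0] * n
        weights[i] = mu[users[i]]  # base-rate contribution
        for j in range(i):
            dt = times[i] - times[j]
            # influence of the author of j on the author of i, decayed in time
            weights[j] = alpha[users[j]][users[i]] * beta * math.exp(-beta * dt)
        total = sum(weights)  # intensity of user i's process at times[i]
        for j in range(i + 1):
            P[i][j] = weights[j] / total
    return P

def root_source_probabilities(P):
    """R[i][k]: probability that the thread containing comment i was started
    by comment k. Dynamic programme: R[i][i] = P[i][i], and for k < i,
    R[i][k] = sum over j < i of P[i][j] * R[j][k]."""
    n = len(P)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        R[i][i] = P[i][i]
        for j in range(i):
            for k in range(j + 1):
                R[i][k] += P[i][j] * R[j][k]
    return R

# Hypothetical toy thread: three comments by two users.
times = [0.0, 1.0, 2.0]
users = [0, 1, 0]
mu = [0.5, 0.5]
alpha = [[0.2, 0.8], [0.8, 0.2]]
R = root_source_probabilities(parent_probabilities(times, users, mu, alpha, beta=1.0))
for row in R:
    print([round(p, 3) for p in row])
```

Each row of `R` is a probability distribution over candidate root comments, and aggregating the entries by author would give a distribution over root source users. This direct recursion costs cubic time in the number of comments; it is meant only to illustrate the quantity being computed.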
