Successful New-entry Prediction for Multi-Party Online Conversations via Latent Topics and Discourse Modeling

With the increasing popularity of social media, online interpersonal communication now plays an essential role in people’s everyday information exchange. Whether and how a newcomer can better engage in a community has attracted great interest because of its many applications. Although prior work on early socialization has produced notable results, it has focused on sociological surveys of small groups. To help individuals get through the early socialization period and engage well in online conversations, we study a novel task: predicting whether a newcomer’s message will receive a response from other participants in a multi-party conversation (henceforth Successful New-entry Prediction). This task is relevant to research on online assistants and social media. To investigate the key factors indicating such engagement success, we employ an unsupervised neural network, the Variational Auto-Encoder (VAE), to examine the topic content and discourse behavior in the newcomer’s chatting history and the conversation’s ongoing context. Furthermore, we collect two large-scale datasets, from Reddit and Twitter, to support further research on new entries. Extensive experiments on both datasets show that our model significantly outperforms all the baselines and popular neural models. Additional explainable and visual analyses of new-entry behavior shed light on how to better join others’ discussions.
