An Evolutionary Context-aware Sequential Model for topic evolution of text stream

Abstract: Social media serves as a platform where users acquire information and spread breaking news. The overwhelming volume of fast-growing information makes it challenging to track how a breaking news story or event subsequently develops and to find the corresponding user opinions on specific aspects of it. Tracking the evolution of an event and predicting its subsequent trends therefore play an important role in social media analysis. In this paper, we propose an Evolutionary Context-aware Sequential Model (ECSM) to track the evolutionary trends of streaming text and to investigate the context-aware topics on which each trend focuses. We integrate two novel layers into the Recurrent Chinese Restaurant Process (RCRP): a context-aware topic layer and a Long Short-Term Memory (LSTM) based sequential layer. The context-aware topic layer captures global context-aware semantic coherence, while the sequential layer learns the local dynamics and semantic dependencies that arise during the evolutionary process. Experimental results on real datasets show that our method significantly outperforms state-of-the-art approaches.
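For readers unfamiliar with the RCRP on which ECSM builds, the following is a minimal sketch of the standard first-order RCRP prior only, not of the ECSM model itself: at epoch t, an incoming document joins an existing cluster with probability proportional to that cluster's size in the previous epoch plus its size so far in the current epoch, and opens a new cluster with probability proportional to a concentration parameter. The function name `rcrp_prior`, the `NEW` sentinel, and the example counts are illustrative assumptions; the context-aware topic layer and the LSTM layer described in the abstract are not modeled here.

```python
# Illustrative sketch of the first-order RCRP prior (assumed names, not the authors' code).
NEW = "new"  # hypothetical sentinel key for "open a new cluster"


def rcrp_prior(prev_counts, curr_counts, alpha):
    """Return the normalized RCRP prior over clusters for the next document.

    prev_counts: dict mapping cluster id -> number of documents in epoch t-1
    curr_counts: dict mapping cluster id -> number of documents so far in epoch t
    alpha: concentration parameter (weight of opening a new cluster)
    """
    clusters = set(prev_counts) | set(curr_counts)
    # Weight of an existing cluster: previous-epoch size plus current-epoch size.
    weights = {k: prev_counts.get(k, 0) + curr_counts.get(k, 0) for k in clusters}
    # Weight of a brand-new cluster.
    weights[NEW] = alpha
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


if __name__ == "__main__":
    # Made-up counts: cluster "a" had 3 documents last epoch and 1 so far this epoch.
    print(rcrp_prior({"a": 3, "b": 1}, {"a": 1}, alpha=1.0))
```

Because the previous epoch's counts enter the prior, clusters that were popular recently remain likely to attract new documents, which is what lets the process follow an event across time slices.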
