Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time

Dynamic topic modeling facilitates the identification of topical trends over time in temporal collections of unstructured documents. We introduce a novel unsupervised neural dynamic topic model named as Recurrent Neural Network-Replicated Softmax Model (RNNRSM), where the discovered topics at each time influence the topic discovery in the subsequent time steps. We account for the temporal ordering of documents by explicitly modeling a joint distribution of latent topical dependencies over time, using distributional estimators with temporal recurrent connections. Applying RNN-RSM to 19 years of articles on NLP research, we demonstrate that compared to state-of-the art topic models, RNNRSM shows better generalization, topic interpretation, evolution and trends. We also introduce a metric (named as SPAN) to quantify the capability of dynamic topic model to capture word evolution in topics over time.

[1]  Chong Wang,et al.  Reading Tea Leaves: How Humans Interpret Topic Models , 2009, NIPS.

[2]  Mingyuan Zhou,et al.  Poisson-Gamma dynamical systems , 2016, NIPS.

[3]  Stephan Mandt,et al.  Dynamic Word Embeddings , 2017, ICML.

[4]  J. Aitchison Language Change: Progress or Decay? , 1981 .

[5]  Lawrence Carin,et al.  Hierarchical Bayesian Modeling of Topics in Time-Stamped Documents , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Andrew McCallum,et al.  Topics over time: a non-Markov continuous-time model of topical trends , 2006, KDD '06.

[7]  Ngoc Thang Vu,et al.  Combining Recurrent and Convolutional Neural Networks for Relation Classification , 2016, NAACL.

[8]  John D. Lafferty,et al.  Dynamic topic models , 2006, ICML.

[9]  Geoffrey E. Hinton,et al.  Learning internal representations by error propagation , 1986 .

[10]  Xiaoli Li,et al.  EMNLP versus ACL: Analyzing NLP research over time , 2015, EMNLP.

[11]  Mark Stevenson,et al.  Evaluating Topic Coherence Using Distributional Semantics , 2013, IWCS.

[12]  Daniel Jurafsky,et al.  Studying the History of Ideas Using Topic Models , 2008, EMNLP.

[13]  Xiaojun Wan,et al.  Single Document Keyphrase Extraction Using Neighborhood Knowledge , 2008, AAAI.

[14]  Rajarshi Das,et al.  Gaussian LDA for Topic Models with Word Embeddings , 2015, ACL.

[15]  David Newman,et al.  External evaluation of topic models , 2009 .

[16]  Ngoc Thang Vu,et al.  Bi-directional recurrent neural network with ranking loss for spoken language understanding , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[17]  Geoffrey E. Hinton,et al.  The Recurrent Temporal Restricted Boltzmann Machine , 2008, NIPS.

[18]  Geoffrey E. Hinton Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.

[19]  Paul Smolensky,et al.  Information processing in dynamical systems: foundations of harmony theory , 1986 .

[20]  David M. Blei,et al.  Dynamic Bernoulli Embeddings for Language Evolution , 2017, ArXiv.

[21]  Yasushi Sakurai,et al.  Online multiscale dynamic topic models , 2010, KDD.

[22]  Peter V. Gehler,et al.  The rate adapting poisson model for information retrieval and object recognition , 2006, ICML.

[23]  Vikas Sindhwani,et al.  Learning evolving and emerging topics in social media: a dynamic nmf approach with temporal regularization , 2012, WSDM '12.

[24]  Yiming Yang,et al.  Topic Detection and Tracking Pilot Study Final Report , 1998 .

[25]  James Allan,et al.  Topic detection and tracking: event-based information organization , 2002 .

[26]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[27]  Hinrich Schütze,et al.  Table Filling Multi-Task Recurrent Neural Network for Joint Entity and Relation Extraction , 2016, COLING.

[28]  Geoffrey E. Hinton,et al.  Replicated Softmax: an Undirected Topic Model , 2009, NIPS.

[29]  Yoshua Bengio,et al.  Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription , 2012, ICML.

[30]  James Allan,et al.  Introduction to topic detection and tracking , 2002 .

[31]  Geoffrey E. Hinton,et al.  Modeling Human Motion Using Binary Latent Variables , 2006, NIPS.

[32]  Rong Yan,et al.  Mining Associated Text and Images with Dual-Wing Harmoniums , 2005, UAI.

[33]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[34]  Geoffrey E. Hinton,et al.  Learning Multilevel Distributed Representations for High-Dimensional Sequences , 2007, AISTATS.

[35]  Hui Xiong,et al.  Dynamic Word Embeddings for Evolving Semantic Discovery , 2017, WSDM.