Neural Storyline Extraction Model for Storyline Generation from News Articles

Storyline generation aims to extract events described in news articles under a certain news topic and reveal how those events evolve over time. Most approaches to storyline generation first train supervised models to extract events from news articles published in different time periods and then link relevant extracted events into coherent stories. Such approaches are domain dependent and cannot deal with unseen event types. To tackle this problem, approaches based on probabilistic graphical models jointly model the generation of events and storylines without using annotated data. However, their parameter inference procedure is complex, and the models often take a long time to converge. In this paper, we propose a novel neural-network-based approach to extract structured representations and evolution patterns of storylines without using annotated data. In this model, the title and the main body of a news article are assumed to share a similar storyline distribution. Moreover, similar documents from neighboring time periods are also assumed to share similar storyline distributions. Based on these assumptions, structured representations and evolution patterns of storylines can be extracted. The proposed model has been evaluated on three news corpora, and the experimental results show that it outperforms state-of-the-art approaches to storyline generation in both accuracy and efficiency.
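The first assumption, that an article's title and main body share a similar storyline distribution, can be illustrated with a small sketch. This is not the paper's model or objective, which the abstract does not specify. It assumes, purely for illustration, that each text is mapped to unnormalized storyline scores, that a softmax turns those scores into a distribution, and that KL divergence measures the disagreement. All names and numbers below are hypothetical.

```python
import math


def softmax(scores):
    """Turn unnormalized scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


# Hypothetical storyline scores over 3 storylines (assumed values, not from the paper).
title_scores = [2.0, 0.5, -1.0]      # title of an article
body_scores = [1.8, 0.7, -0.9]       # main body of the same article
unrelated_scores = [-1.0, 0.5, 2.0]  # body of an unrelated article

title_dist = softmax(title_scores)
body_dist = softmax(body_scores)
unrelated_dist = softmax(unrelated_scores)

# Under the assumption, title and body of the same article should agree closely,
# so this disagreement term is small and could serve as a training signal.
agreement_loss = kl_divergence(title_dist, body_dist)
mismatch_loss = kl_divergence(title_dist, unrelated_dist)

print(f"same article:      {agreement_loss:.4f}")
print(f"unrelated article: {mismatch_loss:.4f}")
```

The same kind of disagreement term could, in principle, be applied between similar documents from neighboring time periods, which is the second assumption stated above. How the paper actually encodes either assumption is not described in the abstract.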
