Continuous Similarity Learning with Shared Neural Semantic Representation for Joint Event Detection and Evolution

As the Internet continues to grow rapidly, people are often overwhelmed by the volume of official news streams and unofficial self-media tweets. To help readers follow the news topics they care about, there is a growing need for systems that can extract important events from this volume of data and organize the evolution of those events into a coherent story. Most existing methods treat event detection and event evolution as two independent subtasks in a pipeline. The interdependence between the two subtasks is therefore often ignored, so bias in one stage propagates to the next. In addition, the performance of both subtasks remains limited by the quality of the semantic representation of news documents. To tackle these problems, this paper proposes a Joint Event Detection and Evolution (JEDE) model that detects events and discovers event evolution relationships from news streams. Specifically, JEDE is built on a Siamese network. It first uses a bidirectional GRU attention network to learn a vector-based semantic representation of news documents, which is shared across the two subtask networks. Two continuous similarity metrics are then learned with stacked neural networks to judge whether two news documents belong to the same event and whether two events belong to the same story. Furthermore, because few datasets with ground-truth labels are available, we construct a new dataset, named EDENS, which contains valid labels for events and stories. Experimental results on this new dataset show that, thanks to the shared representation and joint training, the proposed model consistently achieves significant improvements over the baseline methods.
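The architecture described above can be illustrated with a minimal sketch. It is not the authors' implementation: the bidirectional GRU attention encoder is replaced by a simple attention-weighted average of word vectors, weights are random and untrained, and all names (`attention_pool`, `SimilarityHead`, `encode_event`, `DIM`) as well as the choice to represent an event as the mean of its document vectors are assumptions made for illustration. What it shows is the overall shape: one shared document encoder feeding two separate learned similarity scorers, one for "same event" and one for "same story".

```python
import math
import random

random.seed(0)
DIM = 8  # hypothetical embedding size, chosen for illustration


def rand_vec(n):
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_pool(word_vecs, query):
    """Attention-weighted average of word vectors.

    Stand-in for the BiGRU attention encoder: no recurrence is modeled here.
    """
    scores = [dot(w, query) for w in word_vecs]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(exps)
    return [sum(e / total * w[i] for e, w in zip(exps, word_vecs))
            for i in range(len(query))]


class SimilarityHead:
    """One-hidden-layer scorer mapping a pair of vectors to a value in (0, 1)."""

    def __init__(self, dim, hidden=4):
        self.w1 = [rand_vec(2 * dim) for _ in range(hidden)]
        self.w2 = rand_vec(hidden)

    def __call__(self, a, b):
        # Symmetric pair features, so score(a, b) == score(b, a).
        feats = [abs(x - y) for x, y in zip(a, b)] + [x * y for x, y in zip(a, b)]
        hidden = [math.tanh(dot(row, feats)) for row in self.w1]
        return 1.0 / (1.0 + math.exp(-dot(self.w2, hidden)))


SHARED_QUERY = rand_vec(DIM)      # parameter of the shared encoder
same_event = SimilarityHead(DIM)  # document-document similarity
same_story = SimilarityHead(DIM)  # event-event similarity


def encode_doc(word_vecs):
    return attention_pool(word_vecs, SHARED_QUERY)


def encode_event(doc_vecs):
    # Assumption: an event is represented by the mean of its document vectors.
    n = len(doc_vecs)
    return [sum(d[i] for d in doc_vecs) / n for i in range(DIM)]


# Two toy "documents" made of random word vectors.
doc_a = [rand_vec(DIM) for _ in range(5)]
doc_b = [rand_vec(DIM) for _ in range(7)]
va, vb = encode_doc(doc_a), encode_doc(doc_b)

p_event = same_event(va, vb)
p_story = same_story(encode_event([va]), encode_event([vb]))
print(f"same-event score: {p_event:.3f}, same-story score: {p_story:.3f}")
```

Because both heads read vectors produced by the same `encode_doc`, a training signal from either subtask would update the shared encoder, which is the intuition behind joint training on a shared representation.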
