Automatic Event Salience Identification

Identifying the salience (i.e., importance) of discourse units is an important task in language understanding. While events play important roles in text documents, little research exists on analyzing their salience status. This paper empirically studies Event Salience and proposes two salience detection models based on discourse relations. The first is a feature-based salience model that incorporates cohesion among discourse units. The second is a neural model that captures more complex interactions between discourse units. On our new large-scale event salience corpus, both methods significantly outperform the strong frequency baseline, and our neural model further improves on the feature-based one by a large margin. Our analyses demonstrate that our neural model captures interesting connections between salience and discourse unit relations (e.g., scripts and frame structures).
