Whose story is it anyway? Automatic extraction of accounts from news articles

Abstract: Narratives consist of stories that provide insight into social processes. To analyse narratives more efficiently, natural language processing (NLP) methods have been employed to automatically extract information from textual sources, e.g., newspaper articles. Existing work on automatic narrative extraction, however, has ignored the nested character of narratives. In this work, we argue that a narrative may contain multiple accounts given by different actors. Each individual account provides insight into the beliefs and desires underpinning an actor’s actions. We present a pipeline for automatically extracting accounts, consisting of NLP methods for: (1) named entity recognition, (2) event extraction, and (3) attribution extraction. Machine learning-based models for named entity recognition were trained using a state-of-the-art neural network architecture for sequence labelling. For event extraction, we developed a hybrid approach combining semantic role labelling tools, the FrameNet repository of semantic frames, and a lexicon of event nouns. Attribution extraction, in turn, was addressed with the aid of a dependency parser and Levin’s verb classes. To facilitate the development and evaluation of these methods, we constructed a new corpus of news articles, in which named entities, events and attributions have been manually marked up following a novel annotation scheme that covers over 20 event types relating to socio-economic phenomena. Evaluation results show that, relative to a baseline method underpinned solely by semantic role labelling tools, our event extraction approach improves recall by 12.22–14.20 percentage points (reaching as high as 92.60% on one data set). The use of Levin’s verb classes in attribution extraction yields the best F-score, outperforming a baseline method by 7.64–11.96 percentage points. Our proposed approach was applied to news articles focused on industrial regeneration cases, which made it possible to generate accounts of events that are attributed to specific actors.
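To make the notion of an account concrete, the following is a minimal sketch of how attributed content might be grouped by source. It is not the pipeline described above: it assumes a small, hand-picked list of reporting verbs as a stand-in for a lexicon derived from Levin's verb classes, and it uses a regular expression in place of a dependency parser. The names `REPORTING_VERBS`, `extract_attribution` and `build_accounts` are hypothetical and chosen for illustration only.

```python
import re
from collections import defaultdict

# Assumption: an illustrative handful of reporting verbs (past tense only),
# standing in for a verb lexicon; this is not the lexicon used in the paper.
REPORTING_VERBS = ("said", "claimed", "announced", "stated", "declared", "reported")

# Surface pattern "<source> <reporting verb> [that] <content>".
# A regex is a crude substitute for the dependency parse used in the paper.
ATTRIBUTION = re.compile(
    r"^(?P<source>.+?)\s+(?P<cue>" + "|".join(REPORTING_VERBS) + r")\s+"
    r"(?:that\s+)?(?P<content>.+?)\.?$"
)


def extract_attribution(sentence):
    """Return (source, cue, content) if the sentence matches the pattern, else None."""
    match = ATTRIBUTION.match(sentence.strip())
    if match is None:
        return None
    return match.group("source"), match.group("cue"), match.group("content")


def build_accounts(sentences):
    """Group attributed content by source, giving one 'account' per actor."""
    accounts = defaultdict(list)
    for sentence in sentences:
        attribution = extract_attribution(sentence)
        if attribution is not None:
            source, _cue, content = attribution
            accounts[source].append(content)
    return dict(accounts)


if __name__ == "__main__":
    example = [
        "The council said that the factory will close.",
        "Acme Ltd announced a new investment in the site.",
        "The plant closed in 1975.",  # no reporting verb, so not attributed
    ]
    for actor, statements in build_accounts(example).items():
        print(actor, "->", statements)
```

In a fuller implementation the source would be linked to a recognised named entity and the content to extracted events, so that each account lists the events a given actor reports.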
