A Knowledge Acquisition Method for Event Extraction and Coding Based on Deep Patterns

A major problem in the field of peace and conflict studies is extracting events from a variety of news sources. The events need to be coded with an event type and annotated with entities from a domain-specific ontology for future retrieval and analysis. The problem is dynamic in nature, characterised by new or changing groups and targets and by the emergence of new types of events. A number of automated event extraction systems exist that detect thousands of events on a daily basis. The resulting datasets, however, lack sufficient coverage of specific domains and contain too many duplicated and irrelevant events. Therefore, expert event coding and validation is required to ensure sufficient quality and coverage of a conflict. We propose a new framework for semi-automatic, rule-based event extraction and coding based on deep syntactic-semantic patterns created from normal user input to an event annotation system. The method is implemented in a prototype Event Coding Assistant that processes news articles and suggests relevant events to a user, who can correct or accept the suggestions. Over time, as a knowledge base of patterns is built, event extraction accuracy improves and, as shown by analysis of system logs, the workload of the user decreases.
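The suggest, correct-or-accept, and learn cycle described above can be illustrated with a minimal sketch. It is not the Event Coding Assistant's implementation: the names `Clause`, `Event`, `PatternBase`, and `code_clause` are hypothetical, the input is assumed to be already parsed into subject, verb lemma, and object, and a simple verb-to-event-type lookup stands in for the deep syntactic-semantic patterns.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Clause:
    """Hypothetical pre-parsed clause: subject, verb lemma, object."""
    subject: str
    verb: str
    obj: str


@dataclass(frozen=True)
class Event:
    """A coded event: type plus actor and target entities."""
    event_type: str
    actor: str
    target: str


@dataclass
class PatternBase:
    """Toy knowledge base: one verb lemma maps to one event type.

    Assumption: this lookup is only a stand-in for deep patterns.
    """
    verb_to_type: Dict[str, str] = field(default_factory=dict)

    def suggest(self, clause: Clause) -> Optional[Event]:
        # Suggest an event only if a stored pattern matches the verb.
        event_type = self.verb_to_type.get(clause.verb)
        if event_type is None:
            return None
        return Event(event_type, clause.subject, clause.obj)

    def learn(self, clause: Clause, coded: Event) -> None:
        # Store (or overwrite) the pattern derived from the user's coding.
        self.verb_to_type[clause.verb] = coded.event_type


def code_clause(kb: PatternBase, clause: Clause, user_event: Event) -> Tuple[Event, bool]:
    """Return the final event and whether the suggestion was accepted as-is."""
    accepted = kb.suggest(clause) == user_event
    if not accepted:
        # A missing or corrected suggestion becomes a new pattern.
        kb.learn(clause, user_event)
    return user_event, accepted


kb = PatternBase()
_, first = code_clause(
    kb, Clause("rebels", "attack", "convoy"), Event("ASSAULT", "rebels", "convoy")
)
_, second = code_clause(
    kb, Clause("militia", "attack", "village"), Event("ASSAULT", "militia", "village")
)
print(first, second)
```

In this toy run the first clause has no matching pattern, so the user's coding is stored; the second clause reuses that pattern and the suggestion is accepted without correction, which is the sense in which user workload can fall as the knowledge base grows.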
