Bootstrapped Training of Event Extraction Classifiers

Most event extraction systems are trained with supervised learning and rely on a collection of annotated documents. Due to the domain-specificity of this task, event extraction systems must be retrained with new annotated data for each domain. In this paper, we propose a bootstrapping solution for event role filler extraction that requires minimal human supervision. We aim to rapidly train a state-of-the-art event extraction system using a small set of "seed nouns" for each event role, a collection of relevant (in-domain) and irrelevant (out-of-domain) texts, and a semantic dictionary. The experimental results show that the bootstrapped system outperforms previous weakly supervised event extraction systems on the MUC-4 data set, and achieves performance levels comparable to supervised training with 700 manually annotated documents.
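As a rough illustration of how the inputs named above might be turned into initial training instances, the sketch below labels nouns in relevant texts with the role of any matching seed noun and treats nouns in irrelevant texts as negatives. This is an assumption made for illustration, not the method from the paper: the function name `seed_label`, the pre-tokenized noun lists, and the negative-labeling rule are all hypothetical, and the semantic dictionary and the bootstrapping iterations are left out.

```python
from typing import Dict, List, Optional, Set, Tuple

# (noun, role); role is None for a negative instance.
Instance = Tuple[str, Optional[str]]


def seed_label(
    relevant_docs: List[List[str]],
    irrelevant_docs: List[List[str]],
    seed_nouns: Dict[str, Set[str]],
) -> List[Instance]:
    """Hypothetical seed-labeling step (an assumption, not the paper's algorithm).

    Each document is a list of nouns already extracted from the text.
    Seed nouns are expected in lowercase.
    """
    # Invert the role -> seeds mapping so each seed noun points to its role.
    noun_to_role = {
        noun: role for role, nouns in seed_nouns.items() for noun in nouns
    }

    instances: List[Instance] = []

    # In relevant (in-domain) texts, a seed noun becomes a positive
    # instance for its role; all other nouns are left unlabeled.
    for doc in relevant_docs:
        for noun in doc:
            role = noun_to_role.get(noun.lower())
            if role is not None:
                instances.append((noun.lower(), role))

    # In irrelevant (out-of-domain) texts, every noun becomes a negative
    # instance, even if it matches a seed.
    for doc in irrelevant_docs:
        for noun in doc:
            instances.append((noun.lower(), None))

    return instances


seeds = {"victim": {"mayor"}, "weapon": {"bomb"}}
relevant = [["Mayor", "bomb", "street"]]
irrelevant = [["bomb", "market"]]
labeled = seed_label(relevant, irrelevant, seeds)
```

In a bootstrapping setup, instances like these would train a first classifier, whose confident predictions on unlabeled text would then be added to the training set for the next round.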
