Multi-Event Extraction Guided by Global Constraints

This paper addresses the extraction of event records from documents that describe multiple events. Specifically, we aim to identify the fields of information contained in a document and to group the fields that describe the same event. To exploit the inherent connections between field extraction and event identification, we propose to model them jointly. Our model is novel in that it integrates information from separate sequential models, using global potentials that encourage the extracted event records to have desired properties. Although the model contains high-order potentials, efficient approximate inference can be performed with dual decomposition. We experiment with two data sets that consist of newspaper articles describing multiple terrorism events, and show that our model substantially outperforms traditional pipeline models.
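As a rough illustration of how dual decomposition can combine separately decodable models, the following is a minimal sketch rather than the paper's actual model. It assumes a toy setting in which a chain model (decoded with Viterbi) and a position-wise scorer must agree on one label per token. The function names `viterbi` and `dual_decompose`, the score tables, and the step-size schedule are all hypothetical choices made for this example.

```python
def viterbi(unary, trans):
    """Best label sequence under unary[i][l] + trans[prev][l] scores."""
    n, k = len(unary), len(trans)
    score = [list(unary[0])]
    back = []
    for i in range(1, n):
        row, ptr = [], []
        for l in range(k):
            # Best previous label for reaching label l at position i.
            best = max(range(k), key=lambda p: score[-1][p] + trans[p][l])
            row.append(score[-1][best] + trans[best][l] + unary[i][l])
            ptr.append(best)
        score.append(row)
        back.append(ptr)
    last = max(range(k), key=lambda l: score[-1][l])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def dual_decompose(unary_a, trans_a, unary_b, n_iter=50, step=1.0):
    """Subgradient dual decomposition over two models that share labels.

    Model A is a chain model; model B scores each position independently
    (both are toy assumptions). Returns (labels, converged).
    """
    n, k = len(unary_a), len(trans_a)
    # Lagrange multipliers u[i][l] for the agreement constraints.
    u = [[0.0] * k for _ in range(n)]
    ya = []
    for t in range(1, n_iter + 1):
        # Each subproblem is solved exactly with its own decoder.
        shifted = [[unary_a[i][l] + u[i][l] for l in range(k)] for i in range(n)]
        ya = viterbi(shifted, trans_a)
        yb = [max(range(k), key=lambda l: unary_b[i][l] - u[i][l])
              for i in range(n)]
        if ya == yb:
            # Agreement certifies an exact joint optimum for this toy setup.
            return ya, True
        # Subgradient step: penalise each model's choice where they disagree.
        for i in range(n):
            if ya[i] != yb[i]:
                u[i][ya[i]] -= step / t
                u[i][yb[i]] += step / t
    return ya, False


# Toy scores (hypothetical): model A prefers staying on label 0,
# model B strongly prefers label 1 at the middle position.
unary_a = [[2, 0], [0, 1], [2, 0]]
trans_a = [[1, 0], [0, 1]]
unary_b = [[0, 0], [0, 3], [0, 0]]
labels, converged = dual_decompose(unary_a, trans_a, unary_b)
print(labels, converged)
```

In this sketch the two decoders never need to know about each other; they interact only through the multipliers `u`, which is the property that makes the approach attractive when a joint model would otherwise require intractable exact inference.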
