Widening the Field of View of Information Extraction Through Sentential Event Recognition

Event-based Information Extraction (IE) is the task of identifying entities that play specific roles within an event described in free text. For example, given text documents containing descriptions of disease outbreak events, the goal of an IE system is to extract the event role fillers of each outbreak described in the documents, such as the disease, the victims, the location, and the date. IE systems typically rely on local clues around each phrase to identify its role within a relevant event. This research aims to improve IE performance by incorporating evidence from the wider sentential context, enabling the IE model to make better decisions when local contextual clues are weak. To make better inferences about event role fillers, this research introduces an "event recognition" phase, which is used in combination with localized text extraction. The event recognizer operates on sentences and locates those that discuss events of interest. Localized text extraction can then capitalize on this information and identify event role fillers even when the evidence in their local context is weak or inconclusive. First, this research presents PIPER, a pipelined approach for IE that incorporates this idea. This model cascades a classifier-based sentential event recognizer with a pattern-based localized text extraction component in a pipeline. This enables the pattern-based system to exploit sentential information for better IE coverage. Second, a unified probabilistic approach for IE, called GLACIER, is introduced to overcome limitations arising from the discrete nature of the pipelined model. GLACIER combines the probability of event sentences with the probability of phrasal event role fillers into a single joint probability, which helps to better balance the influence of the two components in the IE model.
An empirical evaluation of these models shows that the use of an event recognition phase improves IE performance, and that incorporating such additional information through a unified probabilistic model produces the most effective IE system.
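
The contrast between a discrete pipeline and a single joint probability can be illustrated with a minimal sketch. It is not the PIPER or GLACIER implementation: the function names, the threshold values, and the choice to form the joint probability as a simple product of a sentence-level probability and a phrase-level conditional probability are all assumptions made for illustration only.

```python
# Hypothetical sketch: hard pipeline decision vs. a joint-probability decision.
# All names and thresholds are illustrative assumptions, not taken from the
# systems described in the abstract.

def pipeline_extract(p_event_sentence, p_filler_given_event,
                     sentence_threshold=0.5, filler_threshold=0.5):
    """Hard cascade: a sentence rejected by the recognizer yields no extraction."""
    if p_event_sentence < sentence_threshold:
        return False  # the phrase-level evidence is never consulted
    return p_filler_given_event >= filler_threshold


def joint_probability(p_event_sentence, p_filler_given_event):
    """Assumed combination: P(filler, event) = P(event) * P(filler | event)."""
    return p_event_sentence * p_filler_given_event


def unified_extract(p_event_sentence, p_filler_given_event, joint_threshold=0.25):
    """Soft decision: weak sentential evidence can be offset by strong phrasal evidence."""
    return joint_probability(p_event_sentence, p_filler_given_event) >= joint_threshold


if __name__ == "__main__":
    # A sentence the recognizer is unsure about, containing a phrase with
    # strong local evidence of being a role filler.
    p_sentence, p_phrase = 0.45, 0.90
    print("pipeline:", pipeline_extract(p_sentence, p_phrase))
    print("unified: ", unified_extract(p_sentence, p_phrase))
```

In this toy setting the hard cascade discards the phrase because the sentence score falls just under its threshold, while the joint score still clears its own threshold. This is one way to read the abstract's point about balancing the influence of the two components.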
