Corpus-driven learning of Event Recognition Rules

This paper describes a complex framework for adapting IE systems to changing domains and users. The proposed methodology integrates several corpus-based learning methods (example-driven conceptual clustering, corpus-driven probabilistic learning and terminological reasoning) with an available general-purpose ontology. First experiments were carried out on domain-specific corpora (the annotated portion of the Penn Treebank and the Reuters TREVI collection), using WordNet [16] as the reference ontology. However, the methodology is independent of both the specific domain and the adopted ontology. Early evaluation is promising, and first results are presented and discussed.

[1] Ralph Grishman, et al. NYU: Description of the MENE Named Entity System as Used in MUC-7, 1998, MUC.

[2] Robert C. Berwick, et al. Learning Structural Descriptions of Grammar Rules from Examples, 1979, IJCAI.

[3] Roberto Basili, et al. Lexical Acquisition and Information Extraction, 1997, SCIE.

[4] Yorick Wilks, et al. University of Sheffield: Description of the LaSIE System as Used for MUC-6, 1995, MUC.

[5] Roberto Basili, et al. An empirical approach to Lexical Tuning, 2000.

[6] Claudio Carpineto, et al. A lattice conceptual clustering system and its application to browsing retrieval, 2004, Machine Learning.

[7] Fabio Rinaldi, et al. FACILE: Description of the NE System Used for MUC-7, 1998, MUC.

[8] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[9] David Fisher, et al. CRYSTAL: Inducing a Conceptual Dictionary, 1995, IJCAI.

[10] Maria Teresa Pazienza, et al. Information Extraction: A Multidisciplinary Approach to an Emerging Information Technology, 1997, Lecture Notes in Computer Science.

[11] Claire Cardie, et al. A Case-Based Approach to Knowledge Acquisition for Domain-Specific Sentence Analysis, 1993, AAAI.

[12] Uri Zernik, et al. Lexical acquisition: Exploiting on-line resources to build a lexicon, 1991.

[13] Robert C. Berwick, et al. The acquisition of syntactic knowledge, 1985.

[14] Yorick Wilks, et al. University of Sheffield: description of the LaSIE system as used for MUC-6, 1995, MUC.

[15] Eric Brill, et al. A Rule-Based Approach to Prepositional Phrase Attachment Disambiguation, 1994, COLING.

[16] Eneko Agirre, et al. A Proposal for Word Sense Disambiguation using Conceptual Distance, 1995, ArXiv.

[17] Ralph Grishman, et al. NYU: Description of the Proteus/PET System as Used for MUC-7 ST, 1998, MUC.

[18] Michael Lebowitz, et al. UNIMEM, a General Learning System: An Overview, 1986, European Conference on Artificial Intelligence.

[19] Roberto Basili, et al. Corpus-Driven Unsupervised Learning of Verb Subcategorization Frames, 1997, AI*IA.

[20] Ellen Riloff, et al. Automatically Constructing a Dictionary for Information Extraction Tasks, 1993, AAAI.