Towards Large-scale Non-taxonomic Relation Extraction: Estimating the Precision of Rote Extractors

In this paper, we describe a rote extractor that learns patterns for finding semantic relations in unrestricted text, with new procedures for pattern generalisation and scoring. An improved method for estimating the precision of the extracted patterns is presented. We show that our method approximates the precision values as evaluated by hand much better than the procedure traditionally used in rote extractors.
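For orientation, the baseline that the abstract calls the traditional procedure can be sketched as follows. This is an assumption, not the paper's method: it takes the traditional estimate to be the one commonly used in surface-pattern learners, where a pattern's precision for a seed pair is the fraction of its matches (with the hook fixed) that extract the known target. The `<hook>`/`<target>` placeholders, the function names, and the toy sentences are hypothetical and chosen only for illustration.

```python
import re


def compile_pattern(pattern: str, hook: str) -> "re.Pattern[str]":
    """Turn a surface pattern such as '<hook> was born in <target>' into a regex.

    The hook is substituted literally; the target slot becomes a capture
    group matching a single word (a simplifying assumption).
    """
    before, after = pattern.replace("<hook>", hook).split("<target>")
    return re.compile(re.escape(before) + r"(\w+)" + re.escape(after))


def estimate_precision(pattern: str, hook: str, target: str, sentences: list) -> float:
    """Fraction of pattern matches whose extracted word equals the known target."""
    regex = compile_pattern(pattern, hook)
    total = 0    # matches with any word in the target slot
    correct = 0  # matches where that word is the known target
    for sentence in sentences:
        for match in regex.finditer(sentence):
            total += 1
            if match.group(1) == target:
                correct += 1
    return correct / total if total else 0.0


# Toy corpus (made up for illustration).
sentences = [
    "Mozart was born in 1756 .",
    "Mozart was born in Salzburg .",
    "Mozart died in 1791 .",
]
print(estimate_precision("<hook> was born in <target>", "Mozart", "1756", sentences))
```

Because such an estimate is computed only on sentences that mention the seed hook, it can differ from the precision a human judge would assign on unrestricted text, which is the gap the abstract says the improved estimation method narrows.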
