Creating Extraction Pattern by Combining Part of Speech Tagger and Grammatical Parser

Most previous work on extraction patterns relies on a syntactic analyzer and a semantic tagger to create patterns that extract relevant information from free-text documents or from more structured documents such as web pages. In this paper, we propose an approach that creates a set of extraction patterns by combining a Part-of-Speech (POS) tagger with a grammatical parser, specifically the Stanford POS Tagger and the Link Grammar (LG) Parser. The extraction patterns will be used in a Named Entity Recognition (NER) system to identify occurrences of entities in free-text documents. We demonstrate the algorithm on accident reports as a case study.
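
As an illustration only, the following Python sketch shows one way POS tags and parser links could be combined into a pattern string for a candidate entity. The tags and links are hand-written stand-ins for tagger and parser output (neither tool is called), and the pattern format and the names build_pattern, TAGGED, and LINKS are assumptions made for this example, not the algorithm of the paper.

```python
from typing import List, Tuple

# Hypothetical, hand-written stand-ins for tool output; no tagger or parser
# is invoked. POS tags use the Penn Treebank tag set. Link labels imitate
# Link Grammar connector names (S = subject-verb, O = verb-object,
# D = determiner-noun).
Tagged = List[Tuple[str, str]]   # (word, POS tag)
Link = Tuple[int, int, str]      # (left token index, right token index, label)

TAGGED: Tagged = [
    ("The", "DT"), ("worker", "NN"), ("injured", "VBD"),
    ("his", "PRP$"), ("hand", "NN"),
]
LINKS: List[Link] = [(0, 1, "D"), (1, 2, "S"), (2, 4, "O"), (3, 4, "D")]


def build_pattern(tagged: Tagged, links: List[Link], target: int) -> str:
    """Describe the target token by its POS tag and every link it takes part in.

    The output format is an assumption for this sketch:
    "<POS> LABEL:side:POS/word ...", with link parts sorted alphabetically.
    """
    _, pos = tagged[target]
    parts = []
    for left, right, label in links:
        if target == left:
            other, side = right, "right"
        elif target == right:
            other, side = left, "left"
        else:
            continue  # this link does not involve the target token
        other_word, other_pos = tagged[other]
        parts.append(f"{label}:{side}:{other_pos}/{other_word.lower()}")
    return f"<{pos}> " + " ".join(sorted(parts))


# Pattern for the token "hand" (index 4) in the example sentence.
pattern = build_pattern(TAGGED, LINKS, 4)
print(pattern)
```

In this toy form, a pattern records that the candidate is a noun, that it is the object of the verb "injured", and that it carries a possessive determiner. A real system would obtain the tags and links from the tools named above and would likely generalize over the specific words.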
