A Memory-Based Approach to Learning Shallow Natural Language Patterns

Recognizing shallow linguistic patterns, such as basic syntactic relationships between words, is a common task in applied natural language and text processing. The common practice is to define the possible pattern structures by hand, a tedious process, often in the form of regular expressions or finite automata. This paper presents a novel memory-based learning method that recognizes shallow patterns in new text based on a bracketed training corpus. The training data are stored as-is in efficient suffix-tree data structures. Generalization is performed on-line at recognition time by comparing subsequences of the new text to positive and negative evidence in the corpus. In this way, no information in the training data is lost, as can happen in learning systems that construct a single generalized model at training time. The paper presents experimental results for recognizing noun phrase, subject-verb, and verb-object patterns in English. Since the learning approach enables easy porting to new domains, we plan to apply it to syntactic patterns in other languages and to sub-language patterns for information extraction.
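
The idea of keeping the bracketed corpus as-is and weighing positive against negative evidence at recognition time can be illustrated with a minimal sketch. It is not the paper's algorithm: a plain dictionary of tag subsequences stands in for the suffix tree, a candidate is scored only by exact matches of its whole tag sequence (no partial or contextual matching), and the names `train`, `score`, and the toy `corpus` are hypothetical.

```python
from collections import Counter

def train(corpus):
    """Store raw evidence from a bracketed corpus (assumed format).

    corpus: list of (tags, spans) pairs, where tags is a list of
    part-of-speech tags and spans lists (start, end) brackets,
    end exclusive, marking the target pattern (here: noun phrases).
    """
    total = Counter()     # every occurrence of a tag subsequence
    positive = Counter()  # occurrences that coincide with a bracket
    for tags, spans in corpus:
        n = len(tags)
        # A dictionary of all subsequences replaces the suffix tree here.
        for i in range(n):
            for j in range(i + 1, n + 1):
                total[tuple(tags[i:j])] += 1
        for start, end in spans:
            positive[tuple(tags[start:end])] += 1
    return total, positive

def score(candidate, total, positive):
    """Fraction of training occurrences of the candidate that were bracketed."""
    key = tuple(candidate)
    seen = total[key]
    return positive[key] / seen if seen else 0.0

# Toy training data: two tagged sentences with noun-phrase brackets.
corpus = [
    (["DT", "NN", "VBD", "DT", "JJ", "NN"], [(0, 2), (3, 6)]),
    (["NN", "VBD", "DT", "NN"], [(0, 1), (2, 4)]),
]
total, positive = train(corpus)

print(score(["DT", "NN"], total, positive))   # always bracketed in the toy data
print(score(["NN", "VBD"], total, positive))  # seen, but never bracketed
print(score(["NN"], total, positive))         # bracketed in 1 of 4 occurrences
```

Because nothing is compiled into a model at training time, every stored occurrence remains available as evidence when a new candidate is scored; unbracketed occurrences act as the negative evidence.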
