A Report of Recent Progress in Transformation-Based Error-Driven Learning

Most recent research in trainable part of speech taggers has explored stochastic tagging. While these taggers obtain high accuracy, linguistic information is captured indirectly, typically in tens of thousands of lexical and contextual probabilities. In [Brill 92], a trainable rule-based tagger was described that obtained performance comparable to that of stochastic taggers, but captured relevant linguistic information in a small number of simple non-stochastic rules. In this paper, we describe a number of extensions to this rule-based tagger. First, we describe a method for expressing lexical relations in tagging that stochastic taggers are currently unable to express. Next, we show a rule-based approach to tagging unknown words. Finally, we show how the tagger can be extended into a k-best tagger, where multiple tags can be assigned to words in some cases of uncertainty.

[1]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[2]  Steven J. DeRose,et al.  Grammatical Category Disambiguation by Statistical Optimization , 1988, CL.

[3]  Carl de Marcken,et al.  Parsing the LOB Corpus , 1990, ACL.

[4]  B. Merialdo,et al.  Tagging text with a probabilistic model , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[5]  Penelope Sibun,et al.  A Practical Part-of-Speech Tagger , 1992, ANLP.

[6]  Robert F. Simmons,et al.  A Computational Approach to Grammatical Coding of English Words , 1963, JACM.

[7]  Eugene Charniak,et al.  Equations for Part-of-Speech Tagging , 1993, AAAI.

[8]  Kenneth Ward Church A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text , 1988, ANLP.

[9]  Roger K. Moore Computer Speech and Language , 1986 .

[10]  Keh-Yih Su,et al.  A New Quantitative Quality Measure for Machine Translation Systems , 1992, COLING.

[11]  Julian M. Kupiec,et al.  Robust part-of-speech tagging using a hidden Markov model , 1992 .

[12]  SchwartzRichard,et al.  Coping with ambiguity and unknown words through probabilistic models , 1993 .

[13]  Frederick Jelinek,et al.  Markov Source Modeling of Text Generation , 1985 .

[14]  George A. Miller,et al.  Introduction to WordNet: An On-line Lexical Database , 1990 .

[15]  Richard M. Schwartz,et al.  Coping with Ambiguity and Unknown Words through Probabilistic Models , 1993, CL.

[16]  Eric Brill,et al.  A Simple Rule-Based Part of Speech Tagger , 1992, HLT.

[17]  Eric Brill,et al.  Automatic Grammar Induction and Parsing Free Text: A Transformation-Based Approach , 1993, ACL.

[18]  Eric Brill,et al.  A corpus-based approach to language learning , 1993 .