Paper information - Text Chunking using Transformation-Based Learning

Transformation-based learning, a technique introduced by Eric Brill (1993b), has been shown to do part-of-speech tagging with fairly high accuracy. This same method can be applied at a higher level of textual interpretation for locating chunks in the tagged text, including non-recursive “baseNP” chunks. For this purpose, it is convenient to view chunking as a tagging problem by encoding the chunk structure in new tags attached to each word. In automatic tests using Treebank-derived data, this technique achieved recall and precision rates of roughly 93% for baseNP chunks (trained on 950K words) and 88% for somewhat more complex chunks that partition the sentence (trained on 200K words). Working in this new application and with larger template and training sets has also required some interesting adaptations to the transformation-based learning approach.
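To make the "chunking as tagging" idea concrete, here is a minimal sketch of how chunk spans might be encoded as one tag per word, decoded again, and scored by chunk-level precision and recall. The abstract does not name the tag set, so the `I`/`O`/`B` labels, the function names, and the example sentence are all illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Tuple

# A chunk is a (start, end) pair of word indices, with end exclusive.
Span = Tuple[int, int]


def chunks_to_tags(n_words: int, chunks: List[Span]) -> List[str]:
    """Encode non-overlapping chunks as one tag per word.

    Assumed tag set (not taken from the abstract):
      "I" = word is inside a chunk
      "O" = word is outside every chunk
      "B" = first word of a chunk that directly follows another chunk
    """
    tags = ["O"] * n_words
    prev_end = None
    for start, end in sorted(chunks):
        for i in range(start, end):
            tags[i] = "I"
        if prev_end == start:
            # Two adjacent chunks: mark the boundary explicitly.
            tags[start] = "B"
        prev_end = end
    return tags


def tags_to_chunks(tags: List[str]) -> List[Span]:
    """Decode a tag sequence back into chunk spans."""
    chunks: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag in ("O", "B") and start is not None:
            chunks.append((start, i))  # close the open chunk
            start = None
        if tag in ("I", "B") and start is None:
            start = i  # open a new chunk
    if start is not None:
        chunks.append((start, len(tags)))
    return chunks


def precision_recall(predicted: List[Span], gold: List[Span]) -> Tuple[float, float]:
    """Chunk-level precision and recall; a chunk counts only if both ends match."""
    correct = len(set(predicted) & set(gold))
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    words = ["The", "cat", "gave", "the", "dog", "a", "bone"]
    gold = [(0, 2), (3, 5), (5, 7)]  # [The cat] gave [the dog] [a bone]
    tags = chunks_to_tags(len(words), gold)
    print(list(zip(words, tags)))
    print(tags_to_chunks(tags) == gold)
    print(precision_recall([(0, 2), (3, 7)], gold))
```

Once chunks are expressed this way, any per-word tagger, including a transformation-based one, can in principle be trained to predict the chunk tags directly, and its output can be decoded and scored against the gold spans.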

Mitchell P. Marcus | Lance A. Ramshaw

[1] James Paul Gee, et al. Performance structures: A psycholinguistic and linguistic appraisal, 1983, Cognitive Psychology.

[2] Eva I. Ejerhed, et al. Finding Clauses in Unrestricted Text by Finitary and Stochastic Methods, 1988, ANLP.

[3] Kenneth Ward Church. A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text, 1988, ANLP.

[4] Steven Abney, et al. Parsing By Chunks, 1991.

[5] Didier Bourigault, et al. Surface Grammatical Analysis for the Extraction of Terminological Noun Phrases, 1992, COLING.

[6] E. Brill, et al. Automatic Grammar Induction and Parsing Free Text: A Transformation-Based Approach, 1993, HLT.

[7] Eric Brill, et al. A corpus-based approach to language learning, 1993.

[8] Atro Voutilainen, et al. NPtool, a Detector of English Noun Phrases, 1995, VLC@ACL.

[9] Eric Brill, et al. Automatic Grammar Induction and Parsing Free Text: A Transformation-Based Approach, 1993, ACL.

[10] Julian Kupiec, et al. An Algorithm for Finding Noun Phrase Correspondences in Bilingual Corpora, 1993, ACL.

[11] Mitchell P. Marcus, et al. Exploring the Statistical Derivation of Transformational Rule Sequences for Part-of-Speech Tagging, 1994, ArXiv.

[12] Eric Brill, et al. Some Advances in Transformation-Based Part of Speech Tagging, 1994, AAAI.

[13] Eric Brill, et al. A Rule-Based Approach to Prepositional Phrase Attachment Disambiguation, 1994, COLING.