Randomized rule selection in transformation-based learning: a comparative study

Transformation-Based Learning (TBL) is a relatively new machine learning method that has achieved notable success on language problems. This paper presents a variant of TBL, called Randomized TBL, that overcomes the training-time problems of standard TBL without sacrificing accuracy. We report a set of experiments on part-of-speech tagging in which the size of the corpus and the template set are varied. The results show that Randomized TBL can handle problems for which the training time of standard TBL is intractable. In addition, for language problems such as dialogue act tagging, where the most effective features have not been identified through linguistic studies, Randomized TBL allows the researcher to experiment with a large set of templates capturing many potentially useful features and feature interactions.
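To make the idea of randomized rule selection concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify the sampling scheme, so the sketch assumes that each training iteration scores only a random sample of the candidate rules instead of all of them. The rule form (previous tag, current tag, new tag), the function names, and the toy data are all hypothetical.

```python
import random

def apply_rule(rule, tags):
    """Apply one rule to a tag sequence and return the new sequence.

    Assumed rule form: (prev_tag, from_tag, to_tag), meaning
    "change from_tag to to_tag when the preceding tag is prev_tag".
    Contexts are read from the tags as they were before the rule fired.
    """
    prev_tag, from_tag, to_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out

def count_errors(tags, gold):
    """Number of positions where the current tags disagree with the gold tags."""
    return sum(t != g for t, g in zip(tags, gold))

def train(tags, gold, candidates, sample_size, rng, max_rules=10):
    """Greedy TBL loop that scores only a random sample of rules per iteration.

    With sample_size >= len(candidates), every rule is scored each iteration,
    which corresponds to the exhaustive search of standard TBL.
    """
    tags = list(tags)
    learned = []
    for _ in range(max_rules):
        pool = rng.sample(candidates, min(sample_size, len(candidates)))
        base = count_errors(tags, gold)
        best_rule, best_gain = None, 0
        for rule in pool:
            gain = base - count_errors(apply_rule(rule, tags), gold)
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:
            # No sampled rule reduced the error count, so stop.
            break
        tags = apply_rule(best_rule, tags)
        learned.append(best_rule)
    return learned, tags

# Toy data (hypothetical): the initial tagger labels every ambiguous word VB.
TAGSET = ["DT", "NN", "VB"]
gold = ["DT", "NN", "VB", "DT", "NN"]
initial = ["DT", "VB", "VB", "DT", "VB"]
candidates = [(p, f, t) for p in TAGSET for f in TAGSET for t in TAGSET if f != t]

rules, final = train(initial, gold, candidates, sample_size=5, rng=random.Random(0))
print(rules, count_errors(final, gold))
```

The per-iteration cost in this sketch grows with `sample_size` instead of with the full number of candidate rules, which is the kind of saving that matters when a large template set generates many rules. The trade-off is that a small sample can miss the best rule in a given iteration, or stop early when no sampled rule helps.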
