Applying Winnow to Context-Sensitive Spelling Correction

Multiplicative weight-updating algorithms such as Winnow have been studied extensively in the COLT literature, but only recently have people started to use them in applications. In this paper, we apply a Winnow-based algorithm to a task in natural language: context-sensitive spelling correction. This is the task of fixing spelling errors that happen to result in valid words, such as substituting "to" for "too", "casual" for "causal", and so on. Previous approaches to this problem have been statistics-based; we compare Winnow to one of the more successful such approaches, which uses Bayesian classifiers. We find that: (1) When the standard (heavily-pruned) set of features is used to describe problem instances, Winnow performs comparably to the Bayesian method; (2) When the full (unpruned) set of features is used, Winnow is able to exploit the new features and convincingly outperform Bayes; and (3) When a test set is encountered that is dissimilar to the training set, Winnow is better than Bayes at adapting to the unfamiliar test set, using a strategy we will present for combining learning on the training set with unsupervised learning on the (noisy) test set.
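To make the multiplicative-update idea concrete, the following is a minimal sketch of the classic Winnow2 algorithm (Littlestone, 1987), not the paper's exact variant: weights start at 1, the threshold is set to the number of features, and on each mistake the weights of the active features are multiplied or divided by a promotion factor alpha. The function names and the choice of alpha are illustrative assumptions.

```python
# Sketch of classic Winnow2 over Boolean feature vectors.
# Weights start at 1; threshold theta = n; on a false negative the
# active features' weights are multiplied by alpha, on a false
# positive they are divided by alpha. (Illustrative, not the
# paper's exact Winnow variant.)

def winnow_train(samples, n, alpha=2.0, epochs=50):
    """samples: list of (x, y) pairs, x a length-n 0/1 vector, y in {0, 1}."""
    w = [1.0] * n
    theta = float(n)
    for _ in range(epochs):
        for x, y in samples:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
            if pred == y:
                continue
            if y == 1:  # false negative: promote active features
                w = [wi * alpha if xi else wi for wi, xi in zip(w, x)]
            else:       # false positive: demote active features
                w = [wi / alpha if xi else wi for wi, xi in zip(w, x)]
    return w, theta

def winnow_predict(w, theta, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
```

Because updates are multiplicative and touch only active features, the mistake bound grows only logarithmically with the total number of features, which is why Winnow can afford the full unpruned feature set that the abstract describes.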
