Paper information - Memory-based context-sensitive spelling correction at web scale

Memory-based context-sensitive spelling correction at web scale

We study the problem of correcting spelling mistakes in text using memory-based learning techniques and a very large database of token n-gram occurrences in web text as training data. Our approach uses the context in which an error appears to select the most likely candidate from words which might have been intended in its place. Using a novel correction algorithm and a massive database of training data, we demonstrate higher accuracy on correcting real-word errors than previous work, and very high accuracy at a new task of ranking corrections to non-word errors given by a standard spelling correction package.
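The idea of using context to choose among candidate words can be illustrated with a minimal sketch. This is not the authors' correction algorithm, which the abstract does not specify; it is a toy illustration that assumes a small in-memory table of n-gram counts (`NGRAM_COUNTS`, hypothetical data) in place of a web-scale database, and a simple back-off from longer to shorter n-grams. The function names `context_score` and `choose_candidate` are likewise made up for this example.

```python
from collections import Counter

# Hypothetical toy counts standing in for a web-scale n-gram database.
NGRAM_COUNTS = Counter({
    ("a", "piece", "of", "cake"): 120,
    ("a", "peace", "of", "cake"): 2,
    ("piece", "of"): 5000,
    ("peace", "of"): 3000,
    ("world", "peace"): 900,
    ("world", "piece"): 1,
})


def context_score(tokens, index, candidate, n):
    """Sum the counts of all n-gram windows covering position `index`
    after substituting `candidate` at that position."""
    replaced = tokens[:index] + [candidate] + tokens[index + 1:]
    total = 0
    for start in range(index - n + 1, index + 1):
        end = start + n
        if start < 0 or end > len(replaced):
            continue  # window falls outside the sentence
        total += NGRAM_COUNTS[tuple(replaced[start:end])]
    return total


def choose_candidate(tokens, index, candidates, max_n=4):
    """Return the candidate best supported by its context.

    Tries the longest n-grams first and backs off to shorter ones when
    no candidate has been observed or the top score is tied.
    """
    for n in range(max_n, 1, -1):
        scores = {c: context_score(tokens, index, c, n) for c in candidates}
        best = max(scores, key=scores.get)
        top = scores[best]
        if top > 0 and list(scores.values()).count(top) == 1:
            return best
    return tokens[index]  # no contextual evidence: keep the original word


if __name__ == "__main__":
    sentence = ["a", "peace", "of", "cake"]
    print(choose_candidate(sentence, 1, ["peace", "piece"]))
```

With the toy counts above, the 4-gram "a piece of cake" outweighs "a peace of cake", so the sketch selects "piece". A real system would query a far larger count database and would need a policy for candidate generation, which this example takes as given.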

Andrew Carlson | Ian Fette
