Toward filling the gap between interactive and fully-automatic spelling correction using the linguistic context

We compare strategies for correcting spelling errors that result in non-existent words. Unlike interactive spelling checkers, where usually only the left context is available, the system we developed takes advantage of the entire context surrounding the misspelling. Moreover, unlike traditional systems, which rely exclusively on a string-to-string edit distance and a word language model, we explore the use of part-of-speech information for selecting candidates. We show that spelling correction improves as the context is extended: the best results are obtained when a part-of-speech filter is combined with a word language model and both the left and right adjacent contexts are used.
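The pipeline outlined in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the system evaluated in the paper: the toy lexicon, the bigram counts, the add-one scoring, and the function names are all hypothetical, and the expected part-of-speech tag is simply passed in by the caller (a tagger run over the surrounding context would be one way to obtain it).

```python
from collections import Counter

# Hypothetical toy lexicon: each word maps to its possible part-of-speech tags.
LEXICON = {
    "the": {"DET"}, "cat": {"NOUN"}, "cut": {"VERB", "NOUN"},
    "car": {"NOUN"}, "sat": {"VERB"}, "on": {"PREP"}, "mat": {"NOUN"},
}

# Hypothetical bigram counts standing in for a word language model.
BIGRAMS = Counter({
    ("the", "car"): 6, ("the", "cat"): 5, ("the", "cut"): 1,
    ("cat", "sat"): 4,
})


def edit_distance(a, b):
    """Plain Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def candidates(misspelling, max_dist=1):
    """Lexicon words within max_dist edits of the misspelling."""
    return sorted(w for w in LEXICON if edit_distance(misspelling, w) <= max_dist)


def score(word, left, right):
    """Add-one smoothed product of the left and right bigram counts."""
    left_count = BIGRAMS[(left, word)] + 1 if left is not None else 1
    right_count = BIGRAMS[(word, right)] + 1 if right is not None else 1
    return left_count * right_count


def correct(misspelling, left=None, right=None, expected_tag=None):
    """Pick the best candidate, optionally filtering by part of speech first."""
    cands = candidates(misspelling)
    if expected_tag is not None:
        filtered = [w for w in cands if expected_tag in LEXICON[w]]
        cands = filtered or cands  # fall back if the filter removes everything
    if not cands:
        return misspelling  # no candidate found: leave the token unchanged
    return max(cands, key=lambda w: score(w, left, right))


print(correct("caz", left="the"))                       # left context only
print(correct("caz", left="the", right="sat"))          # left and right context
print(correct("cxt", left="the", expected_tag="VERB"))  # with the POS filter
```

With these toy counts, the left context alone favors "car" for the misspelling "caz", while adding the right neighbor "sat" switches the choice to "cat"; the part-of-speech filter likewise discards "cat" for "cxt" when a verb is expected.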
