In a preliminary study, we show the effect of spelling errors on an ad hoc information retrieval task. We then compare different strategies for correcting spelling errors that result in non-existent words. Unlike interactive spelling checkers, where usually only the left context is available, the system we developed takes advantage of the entire context surrounding the misspelling. Moreover, unlike traditional systems, which rely exclusively on a string-to-string edit distance and a word language model, we explore the use of part-of-speech information for selecting candidates. We show that spelling correction improves when the context is extended: the best results are obtained by combining a part-of-speech filter with a word language model and using both the left and right adjacent contexts.
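The combination described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the toy lexicon, the bigram counts, the add-one smoothing, the distance threshold, and the names `osa_distance` and `correct` are all assumptions made for illustration, and the expected part-of-speech is assumed to be supplied by an external tagger.

```python
# Hypothetical toy lexicon: word -> part-of-speech tag (illustrative data only).
LEXICON = {
    "the": "DET", "a": "DET", "has": "VERB", "paint": "VERB",
    "patient": "NOUN", "patent": "NOUN", "fever": "NOUN",
}

# Hypothetical bigram counts standing in for a word language model.
BIGRAMS = {
    ("the", "patient"): 5, ("the", "patent"): 5, ("the", "paint"): 3,
    ("patient", "has"): 6,
}


def osa_distance(a, b):
    """Edit distance with insertion, deletion, substitution, and
    adjacent transposition (optimal string alignment variant)."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def correct(left, misspelling, right, expected_pos=None,
            max_dist=2, use_right=True):
    """Pick a replacement for a non-word using its neighbouring words."""
    # 1. Candidate generation: lexicon words within max_dist edits.
    candidates = [w for w in LEXICON
                  if osa_distance(misspelling, w) <= max_dist]
    # 2. Part-of-speech filter (assumed tag); fall back if it removes everything.
    if expected_pos is not None:
        filtered = [w for w in candidates if LEXICON[w] == expected_pos]
        candidates = filtered or candidates
    # 3. Language-model score over the left and, optionally, right context,
    #    using add-one smoothed bigram counts.
    def score(w):
        s = BIGRAMS.get((left, w), 0) + 1
        if use_right:
            s *= BIGRAMS.get((w, right), 0) + 1
        return s
    return max(candidates, key=score) if candidates else misspelling


print(correct("the", "pateint", "has", expected_pos="NOUN"))
```

In this toy data the left context alone scores "patient" and "patent" equally after "the", so only the right neighbour "has" separates them, while the part-of-speech filter removes the verb "paint" before scoring.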