Spelling Correction as an Iterative Process that Exploits the Collective Knowledge of Web Users

Logs of user queries to an internet search engine provide a large amount of implicit and explicit information about language. In this paper, we investigate their use in spelling correction of search queries, a task that poses many challenges beyond those of the traditional spelling correction problem. We present an approach that iteratively transforms the input query string into other strings that correspond to increasingly likely queries, according to statistics extracted from internet search query logs.
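The iterative idea can be illustrated with a minimal sketch. This is not the paper's actual model: it assumes a hypothetical dictionary `log_counts` mapping whole query strings to their frequencies in a query log, uses plain Levenshtein distance as the notion of a "nearby" string, and uses raw frequency as a stand-in for query likelihood. The names `edit_distance`, `correct_query`, `max_dist`, and `max_iters` are illustrative only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def correct_query(query: str, log_counts: dict, max_dist: int = 1,
                  max_iters: int = 10) -> str:
    """Repeatedly move to a more frequent logged query within max_dist edits.

    Assumption: frequency in log_counts approximates how likely a query is.
    Stops at a fixed point, i.e. when no nearby query is more frequent.
    """
    current = query
    for _ in range(max_iters):
        best, best_count = current, log_counts.get(current, 0)
        for candidate, count in log_counts.items():
            if count > best_count and edit_distance(current, candidate) <= max_dist:
                best, best_count = candidate, count
        if best == current:
            break  # fixed point reached
        current = best
    return current


# Hypothetical log statistics, invented for illustration only.
sample_log = {
    "britney spears": 1000,
    "britny spears": 40,
    "britny speers": 3,
}
result = correct_query("britny speers", sample_log)
```

With this sample log, "britny speers" is two edits away from "britney spears", so a single step with `max_dist=1` cannot reach it. The first iteration moves to "britny spears" and the second to "britney spears", which shows why iterating over progressively more frequent queries can help where a single lookup fails.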
