I saw TREE trees in the park: How to Correct Real-Word Spelling Mistakes

This paper presents a context-sensitive spell-checking system that uses mixed trigram models, and introduces a new, empirically grounded method for building confusion sets. The proposed method has been implemented and tested, and it has been evaluated in terms of coverage, precision, and recall. The results show that the method is effective.
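To make the idea of confusion sets and trigram context concrete, the following is a minimal sketch and not the system described in the paper. It assumes plain word trigrams rather than mixed trigrams, a three-sentence toy corpus, and a hand-written confusion set; the names `CORPUS`, `CONFUSION_SETS`, `score`, and `correct` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Toy training corpus (an assumption; a real system would use a large corpus).
CORPUS = [
    "i saw three trees in the park",
    "we saw three birds in the tree",
    "the tree in the park is old",
]

# Hand-written confusion set (an assumption; the paper builds these empirically).
CONFUSION_SETS = {
    "tree": {"tree", "three"},
    "three": {"tree", "three"},
}


def trigrams(tokens):
    """Return all word trigrams of a sentence, padded at both ends."""
    padded = ["<s>", "<s>"] + tokens + ["</s>", "</s>"]
    return [tuple(padded[i:i + 3]) for i in range(len(padded) - 2)]


# Trigram frequencies observed in the toy corpus.
COUNTS = Counter(t for line in CORPUS for t in trigrams(line.split()))


def score(tokens):
    """Crude fit measure: sum of corpus counts of the sentence's trigrams."""
    return sum(COUNTS[t] for t in trigrams(tokens))


def correct(sentence):
    """Greedily replace each confusable word by its best-scoring alternative.

    The original word is kept unless an alternative scores strictly higher.
    """
    tokens = sentence.lower().split()
    for i, word in enumerate(tokens):
        best_word, best_score = word, score(tokens)
        for candidate in sorted(CONFUSION_SETS.get(word, ())):
            trial = tokens[:i] + [candidate] + tokens[i + 1:]
            trial_score = score(trial)
            if trial_score > best_score:
                best_word, best_score = candidate, trial_score
        tokens[i] = best_word
    return " ".join(tokens)


if __name__ == "__main__":
    print(correct("I saw TREE trees in the park"))
```

In this sketch, "tree" is a valid dictionary word, so a conventional spell checker would not flag it; only the surrounding trigrams suggest that "three" fits better. Summing raw counts is a deliberate simplification, and a realistic model would use smoothed probabilities.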
