Malapropisms Detection and Correction using a Paronyms Dictionary, a Search Engine and Wordnet

This paper presents a method for automatically detecting and correcting malapropisms in documents using the WordNet lexical database, a search engine (Google), and a paronyms dictionary. Detection is based on the cohesion of the local context, evaluated with the search engine, while correction relies on the cohesion of the whole text, evaluated in terms of lexical chains built using the linguistic ontology. Correction candidates are taken from the paronyms dictionary and evaluated against both the local and the whole-text cohesion; the best-scoring candidate is chosen as the replacement. The methods used to test the application are presented along with the results obtained.
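
The detect-then-correct flow described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the scoring formulas, so the hit-count table, the relatedness sets, and the function names `local_cohesion`, `text_cohesion`, and `correct` are all hypothetical stand-ins for the search engine, the WordNet lexical chains, and the paronyms dictionary.

```python
from typing import Dict, List, Set

# Toy stand-ins (all hypothetical): a paronyms dictionary, search-engine hit
# counts for two-word phrases, and WordNet-style sets of related words.
PARONYMS: Dict[str, List[str]] = {"flamingo": ["flamenco"]}
HITS: Dict[str, int] = {
    "a flamingo": 900, "flamingo dancer": 40,
    "a flamenco": 700, "flamenco dancer": 5000,
}
RELATED: Dict[str, Set[str]] = {
    "flamingo": {"bird", "feathers"},
    "flamenco": {"dancer", "guitar", "music"},
}


def local_cohesion(words: List[str], i: int, candidate: str) -> int:
    """Assumed score: hit counts of the candidate with its two neighbours."""
    score = 0
    if i > 0:
        score += HITS.get(f"{words[i - 1]} {candidate}", 0)
    if i + 1 < len(words):
        score += HITS.get(f"{candidate} {words[i + 1]}", 0)
    return score


def text_cohesion(words: List[str], i: int, candidate: str) -> int:
    """Crude proxy for lexical chains: count related words elsewhere in the text."""
    related = RELATED.get(candidate, set())
    return sum(1 for j, w in enumerate(words) if j != i and w in related)


def correct(words: List[str]) -> List[str]:
    """Replace a word when a paronym fits the context better (toy criterion)."""
    result = list(words)
    for i, word in enumerate(words):
        candidates = PARONYMS.get(word, [])
        base = local_cohesion(words, i, word)
        # Detection: keep only paronyms that fit the local context better.
        better = [c for c in candidates if local_cohesion(words, i, c) > base]
        if not better:
            continue
        # Correction: rank by whole-text cohesion, then by local cohesion.
        result[i] = max(
            better,
            key=lambda c: (text_cohesion(words, i, c), local_cohesion(words, i, c)),
        )
    return result


if __name__ == "__main__":
    text = "she is a flamingo dancer who loves guitar music".split()
    print(" ".join(correct(text)))
```

In this toy setup a word is flagged only when some paronym scores higher in the local context, and ties between candidates are broken by local cohesion after whole-text cohesion; both choices are assumptions made for illustration.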
