Leveraging Known Semantics for Spelling Correction

Focusing on applications that analyze learner language for semantic appropriateness and accuracy, we build on previous work that modeled some aspects of interaction, namely a picture description task (PDT), with the goal of integrating a spelling correction component in this context. Once a sentence has been parsed and its semantic relations extracted, a surprising number of analysis failures turn out to stem from misspellings, which deviate from expected input in ways that can be modeled when the content of the interaction is known. We thus explore the use of spelling correction tools and language modeling to correct the misspellings that often lead to errors in obtaining semantic forms, and we show that such tools can significantly reduce the number of unanalyzable cases. The work is useful for any context where image descriptions or some other expected content is available, even when the expected linguistic forms are not.
