Pronunciation Modeling in Spelling Correction for Writers of English as a Foreign Language

We propose a method for modeling pronunciation variation in spell checking for non-native writers of English. Spell checkers are typically developed for native speakers and fail to address many of the spelling errors characteristic of non-native writers, especially errors influenced by differences in phonology. We use our model of pronunciation variation to extend a pronouncing dictionary for the spelling correction algorithm of Toutanova and Moore (2002), which combines models of orthography and pronunciation. Pronunciation variation modeling is shown to improve correction performance on misspellings produced by Japanese writers of English.
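As a rough illustration of what "extending a pronouncing dictionary" with pronunciation variants can look like, the sketch below expands each dictionary pronunciation using a small table of phone confusions. The abstract does not say how the variation model is built, so everything here is an assumption made for illustration: the confusion table (`CONFUSIONS`) is hand-written rather than learned, the toy dictionary (`BASE_DICT`) uses ARPAbet-like symbols without stress, and the function names are hypothetical. It is not the paper's model.

```python
from itertools import product

# Hypothetical phone confusions, hand-written for illustration only
# (an assumption, not taken from the paper).
CONFUSIONS = {
    "L": ["L", "R"],
    "R": ["R", "L"],
    "V": ["V", "B"],
    "B": ["B", "V"],
}

# Toy pronouncing dictionary: word -> list of pronunciations,
# each a tuple of ARPAbet-like phones (stress marks omitted).
BASE_DICT = {
    "light": [("L", "AY", "T")],
    "right": [("R", "AY", "T")],
    "very": [("V", "EH", "R", "IY")],
}


def variants(phones, confusions=CONFUSIONS):
    """Return every pronunciation reachable by swapping confusable phones."""
    options = [confusions.get(p, [p]) for p in phones]
    return set(product(*options))


def extend_dictionary(base, confusions=CONFUSIONS):
    """Map each word to its original pronunciations plus all variants."""
    extended = {}
    for word, prons in base.items():
        all_prons = set()
        for pron in prons:
            all_prons |= variants(pron, confusions)
        extended[word] = all_prons
    return extended


def words_for_pronunciation(extended, phones):
    """List the words whose extended entry contains this pronunciation."""
    target = tuple(phones)
    return sorted(w for w, prons in extended.items() if target in prons)


if __name__ == "__main__":
    extended = extend_dictionary(BASE_DICT)
    print(words_for_pronunciation(extended, ["R", "AY", "T"]))
```

With the extended dictionary, a misspelling whose inferred pronunciation is R AY T can retrieve "light" as well as "right" as a candidate. A real system would weight variants by probability instead of treating them all as equally likely.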

[1] C. E. Shannon et al. A mathematical theory of communication, 1948, MOCO.

[2] Fred J. Damerau et al. A technique for computer detection and correction of spelling errors, 1964, CACM.

[3] Antonio Zamora et al. Automatic spelling correction in scientific and scholarly text, 1984, CACM.

[4] Roger Mitton et al. Spelling checkers, spelling correctors and the misspellings of poor spellers, 1987, Inf. Process. Manag.

[5] Kenneth Ward Church et al. Probability scoring for spelling correction, 1991.

[6] Karen Kukich et al. Techniques for automatically correcting words in text, 1992, CSUR.

[7] Edward A. Fox et al. A faster algorithm for constructing minimal perfect hash functions, 1992, SIGIR '92.

[8] Jonathan G. Fiscus et al. DARPA TIMIT: acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1, 1993.

[9] Steve Young et al. The HTK book, 1995.

[10] Roger Mitton et al. English spelling and the computer, 1995.

[11] Nelson Morgan et al. Dynamic pronunciation models for automatic speech recognition, 1999.

[12] William M. Fisher. A statistical text-to-phone function using ngrams and rules, 1999, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).

[13] Eric Brill et al. An Improved Error Model for Noisy Channel Spelling Correction, 2000, ACL.

[14] Kristina Toutanova et al. Pronunciation Modeling for Improved Spelling Correction, 2002, ACL.

[15] Nobuaki Minematsu et al. English Speech Database Read by Japanese Learners for CALL System Development, 2002, LREC.

[16] T. Okada. A Corpus Analysis of Spelling Errors Made by Japanese EFL Writers, 2004.

[17] Roger Mitton et al. The adaptation of an English spellchecker for Japanese writers, 2007.