A Maximum Entropy Approach to Chinese Spelling Check

Spelling check identifies incorrectly written words in documents. Because of the input methods involved, Chinese spelling check differs considerably from English spelling check and remains a challenging task. Over the past decade, most methods for detecting errors in documents have been lexicon-based or probability-based, and much progress has been made. In this paper, we propose a new method for Chinese spelling check that uses maximum entropy (ME). Our experiment shows that, by importing a large raw corpus, maximum entropy can build a well-trained model to detect spelling errors in Chinese documents.
