An approach to machine learning of Chinese Pinyin-to-character conversion for small-memory application

Chinese Pinyin-to-character conversion is used in keyboard-based Chinese character input and in Chinese speech recognition. The key to such a system is machine learning that adapts the system to a specific user. This paper presents an effective machine learning approach to Chinese Pinyin-to-character conversion for small-memory applications. The approach is based on iterative new-word identification and word-frequency increases, which gradually make the segmentation of Chinese text more accurate and eventually meet the user's needs. Applying the proposed machine learning to a keyboard-based Chinese character input system improves the accuracy of Pinyin-to-character conversion from 90% to 98%. Such a system can run in very little memory (limited to 120 K) and therefore meets the needs of small-memory platforms. With the rapid development of digital appliances such as PDAs, mobile telephones, and intelligent refrigerators, and of embedded operating systems, the Pinyin-to-character conversion presented in this paper has found new applications.
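
The abstract does not give the details of the learning procedure, so the following is only a minimal sketch of how iterative new-word identification and word-frequency increases could interact, not the implementation described in this paper. The class `AdaptiveLexicon`, its parameters `max_word_len` and `new_word_threshold`, the use of greedy forward maximum matching for segmentation, and the rule that a repeated run of out-of-lexicon characters becomes a new word are all assumptions made for illustration.

```python
from collections import Counter


class AdaptiveLexicon:
    """Toy user-adaptive lexicon (hypothetical; not the paper's implementation)."""

    def __init__(self, words, max_word_len=4, new_word_threshold=2):
        self.freq = Counter({w: 1 for w in words})  # word -> usage count
        self.candidates = Counter()                 # unseen run -> times observed
        self.max_word_len = max_word_len            # assumed cap on word length
        self.new_word_threshold = new_word_threshold

    def segment(self, text):
        """Greedy forward maximum matching; unknown characters become single tokens."""
        tokens, i = [], 0
        while i < len(text):
            for n in range(min(self.max_word_len, len(text) - i), 0, -1):
                piece = text[i:i + n]
                if n == 1 or piece in self.freq:
                    tokens.append(piece)
                    i += n
                    break
        return tokens

    def learn(self, confirmed_text):
        """Update the lexicon from text the user has confirmed as correct."""
        run = ""
        # The trailing None flushes a run of unknown characters at the end.
        for token in self.segment(confirmed_text) + [None]:
            if token is not None and token not in self.freq:
                run += token  # extend the current run of unknown characters
                continue
            self._count_candidate(run)
            run = ""
            if token is not None:
                self.freq[token] += 1  # word-frequency increase

    def _count_candidate(self, run):
        """Promote a repeated run of unknown characters to a lexicon word."""
        if 2 <= len(run) <= self.max_word_len:
            self.candidates[run] += 1
            if self.candidates[run] >= self.new_word_threshold:
                self.freq[run] = self.candidates.pop(run)


lexicon = AdaptiveLexicon(["我", "喜欢"])
before = lexicon.segment("我喜欢拼音")
lexicon.learn("我喜欢拼音")
lexicon.learn("我喜欢拼音")
after = lexicon.segment("我喜欢拼音")
print(before, after)
```

In this sketch the segmentation improves only after the same unknown run has been confirmed `new_word_threshold` times, which mirrors the gradual improvement the abstract describes. A real converter would also need a Pinyin-to-word table and would presumably use the stored frequencies to rank homophone candidates; both are left out here.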