A bi-lingual Thai-English TTS system on Android mobile devices

This paper presents a bi-lingual Thai-English text-to-speech synthesis (TTS) system on Android mobile devices. The system deploys a Thai text processor and a well-known open-source English text processor, which can analyze English text with high intelligibility. With hidden Markov model (HMM) based speech units and audio streaming optimization, it synthesizes smooth speech with a fast response. This paper describes the optimization of the important components. Conditional random fields (CRF), which have been used successfully for Thai word segmentation, and syllable-pattern-based statistical modeling for Thai grapheme-to-phoneme conversion are assessed. Several types of speech parameters are compared to find the best performance. The optimized system achieved a mean opinion score (MOS) of up to 3.68, with a response time of less than 2 seconds on both high- and low-specification devices.
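Thai text is written without spaces between words, which is why word segmentation is a required step before grapheme-to-phoneme conversion. A common way to apply a CRF to this problem is to tag each character as beginning a word or continuing one, then group characters by tag. The sketch below shows only that final grouping step. The B/I tag scheme, the function name `tags_to_words`, and the hand-written tags are assumptions for illustration; the abstract does not specify the labeling scheme, and no CRF is trained or run here.

```python
def tags_to_words(text, tags):
    """Group characters into words from per-character tags.

    Assumed (hypothetical) scheme: "B" marks the first character of a
    word, "I" marks a character that continues the current word.
    """
    if len(text) != len(tags):
        raise ValueError("text and tags must have the same length")
    words = []
    for ch, tag in zip(text, tags):
        # Start a new word on "B", or when no word has been opened yet.
        if tag == "B" or not words:
            words.append(ch)
        else:
            words[-1] += ch
    return words


# Hand-written tags standing in for the output of a trained tagger.
text = "ภาษาไทย"  # "Thai language", written without a space
tags = ["B", "I", "I", "I", "B", "I", "I"]
print(tags_to_words(text, tags))  # ['ภาษา', 'ไทย']
```

In a real pipeline the tags would come from the sequence model, and the resulting words would be passed on to grapheme-to-phoneme conversion.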
