Paper information - Diphone preparation for Bangla text to speech synthesis

This paper presents the methodologies involved in diphone preparation for Bangla text-to-speech synthesis. A concatenation-based synthesis system basically comprises two modules: natural language processing and digital signal processing (DSP). Natural language processing here involves converting text to its pronounceable form, called text normalization, and selecting diphones based on the normalized text, called grapheme-to-phoneme (G2P) conversion. We developed a speech synthesizer for Bangla using a diphone-based concatenative approach. This paper describes the diphone preparation, labeling, and selection techniques.
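
As a rough illustration of the diphone selection step mentioned in the abstract, the sketch below turns a phoneme sequence into the list of diphone labels that a concatenative synthesizer would look up. It is not the authors' implementation: the function name `phonemes_to_diphones`, the `_` silence symbol, the `a-b` label format, and the example phonemes are all assumptions made for illustration, and the abstract does not specify the paper's actual phoneme inventory or labeling scheme.

```python
def phonemes_to_diphones(phonemes, silence="_"):
    """Return diphone labels for a phoneme sequence.

    A diphone spans the transition from one phoneme to the next, so a
    sequence of n phonemes padded with silence on both sides yields
    n + 1 diphones. The label format "a-b" is a hypothetical convention.
    """
    if not phonemes:
        return []
    # Pad with silence so the utterance-initial and utterance-final
    # transitions are also covered by a diphone.
    padded = [silence, *phonemes, silence]
    # Pair each phoneme with its successor.
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


if __name__ == "__main__":
    # Hypothetical phoneme sequence, not taken from the paper.
    print(phonemes_to_diphones(["k", "a", "l"]))
```

In a complete system, each label would index a recorded and labeled diphone unit, and the DSP module would concatenate the retrieved units.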

M. Shahidur Rahman | Muhammad Masud Rashid | Md. Akter Hussain
