A framework for Bangla text to speech synthesis

We describe a basic framework and methodology for converting Bangla text to speech. The methodology automatically produces articulated words from Bangla input text, starting from the basic pronunciation of Bangla words. Single-tone syllables are taken as the fundamental units of analysis. The methodology selects phonetic units from a vocabulary of recorded utterances and then combines the appropriate diphones to produce the final output. The uttered syllables are selected according to the processed input text, which is analyzed using the most basic grammar rules of Bangla pronunciation. Using this methodology, we developed a prototype system and compared its output with natural human speech.
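The unit-selection and concatenation steps described above can be illustrated with a small sketch. It is not the prototype system: the inventory `UNIT_INVENTORY`, its sample values, and the greedy longest-match segmentation are all assumptions made for illustration, and the abstract does not specify how the input text is actually split into syllables or diphones.

```python
from typing import Dict, List

# Hypothetical inventory: each unit maps to placeholder "audio samples".
# A real system would store recorded waveforms for uttered syllables.
UNIT_INVENTORY: Dict[str, List[float]] = {
    "বাং": [0.1, 0.2, 0.3],
    "বা": [0.1, 0.2],
    "লা": [0.4, 0.5],
}


def segment(text: str, inventory: Dict[str, List[float]]) -> List[str]:
    """Split text into inventory units by greedy longest match (an assumption)."""
    max_len = max(len(unit) for unit in inventory)
    units: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in inventory:
                units.append(candidate)
                i += size
                break
        else:
            # No recorded unit covers this position.
            raise KeyError(f"no unit in inventory for text at position {i}")
    return units


def synthesize(text: str, inventory: Dict[str, List[float]]) -> List[float]:
    """Concatenate the samples of the selected units in order."""
    samples: List[float] = []
    for unit in segment(text, inventory):
        samples.extend(inventory[unit])
    return samples


if __name__ == "__main__":
    print(segment("বাংলা", UNIT_INVENTORY))
    print(synthesize("বাংলা", UNIT_INVENTORY))
```

Plain concatenation is the simplest possible join. A practical synthesizer would typically smooth the boundaries between units, which this sketch leaves out.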
