A novel intonation model to improve the quality of Tamil text-to-speech synthesis system

The global growth of information and communication technologies has strongly influenced research on speech technologies. Visually impaired and vocally challenged people in particular can rely on speech-enabled devices as a lifeline. Broadly, speech technology has two major applications: speech synthesis and speech recognition. Speech synthesis produces synthetic speech from input text, whereas speech recognition interprets human speech and can produce either text or speech as output.
