An Approach to Design an Intelligent Parametric Synthesizer for Emotional Speech

A speech synthesizer is an artificial system that produces speech. Generating emotional speech, however, is a difficult task: although researchers have worked on it for a long time, accuracy remains a challenge. The objective of our work is to design an intelligent model for emotional speech synthesis. We attempt to build such a system using a rule-based fuzzy model. The parameters required by the model are first identified and extracted as features, and the features are analyzed for each speech segment. At the synthesis stage, the model is trained with these parameters and then tested. The test results show its performance.
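The abstract does not give the rule base, the input features, or the output parameters of the fuzzy model. As a rough illustration of what a rule-based fuzzy mapping of this general kind might look like, the sketch below uses zero-order Sugeno-style inference to turn a hypothetical arousal score into a pitch scaling factor. The input variable, the triangular membership functions, and the rule consequents are all illustrative assumptions, not values taken from the paper.

```python
def tri(x, a, b, c):
    """Triangular membership: 0 at or outside the feet a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets over an arousal score in [0, 1].
# The feet extend past the range so that "low" and "high" peak at the ends.
ARO_SETS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

# Hypothetical rule base: IF arousal is <set> THEN pitch scale is <constant>.
RULES = {"low": 0.9, "medium": 1.0, "high": 1.3}


def pitch_scale(arousal):
    """Zero-order Sugeno inference: firing-strength-weighted average of rule outputs."""
    x = min(max(arousal, 0.0), 1.0)  # clamp to the assumed input range
    weights = {name: tri(x, *abc) for name, abc in ARO_SETS.items()}
    total = sum(weights.values())  # positive for any x in [0, 1] with these sets
    return sum(weights[name] * RULES[name] for name in RULES) / total


if __name__ == "__main__":
    for a in (0.0, 0.5, 0.75, 1.0):
        print(a, round(pitch_scale(a), 3))
```

In a full synthesizer the resulting factor would be applied to prosodic parameters such as the fundamental frequency contour of each speech segment; how the paper's model does this is not stated in the abstract.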
