Paper information - Duration modeling for Arabic text to speech synthesis

Duration modeling for Arabic text to speech synthesis

Duration modeling is a fundamental task of prosody generation for text-to-speech (TTS) systems. The objective of this task is to predict the duration of a speech unit from its phonological representation. Duration modeling has a significant influence on the intelligibility and naturalness of the synthesized speech. This paper presents a neural network (NN) based approach to predicting the duration of Arabic phonemes. The developed model uses neural networks to learn the mapping from phonological features to duration values.

Mohsen Rashwan | Yasser Hifny
