Duration modeling for Arabic text-to-speech synthesis

Duration modeling is a fundamental task in prosody generation for text-to-speech (TTS) systems. Its objective is to predict the duration of a speech unit from its phonological representation. Duration modeling has a significant influence on the intelligibility and naturalness of synthesized speech. This paper presents a neural network (NN) based approach to predicting the duration of Arabic phonemes. The developed model uses neural networks to learn the mapping from phonological features to duration values.
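
To illustrate the kind of mapping described above, the following minimal sketch trains a small feed-forward network to regress a duration from one-hot encoded phonological features. It is not the model developed in this paper: the feature set (`phoneme_class`, `stressed`, `position`), the network size, the learning rate, and the toy durations in `TOY_DATA` are all assumptions chosen for illustration only.

```python
import math
import random

# Hypothetical categorical features; the paper's actual feature set is not given here.
FEATURES = {
    "phoneme_class": ["short_vowel", "long_vowel", "consonant"],
    "stressed": ["no", "yes"],
    "position": ["initial", "medial", "final"],
}


def encode(sample):
    """One-hot encode a dict of categorical phonological features."""
    vec = []
    for name, values in FEATURES.items():
        vec.extend(1.0 if sample[name] == v else 0.0 for v in values)
    return vec


class DurationMLP:
    """One hidden tanh layer followed by a linear output (duration in seconds)."""

    def __init__(self, n_in, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr=0.05):
        """One stochastic gradient step; returns the squared error before the update."""
        h, y = self.forward(x)
        err = y - target
        for j, hj in enumerate(h):
            # Backpropagate through tanh using the output weight before it is updated.
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
            self.b1[j] -= lr * grad_h
        self.b2 -= lr * err
        return err * err


# Made-up durations (seconds) for illustration; not measured data.
TOY_DATA = [
    ({"phoneme_class": "short_vowel", "stressed": "no", "position": "medial"}, 0.06),
    ({"phoneme_class": "long_vowel", "stressed": "yes", "position": "medial"}, 0.15),
    ({"phoneme_class": "consonant", "stressed": "no", "position": "initial"}, 0.08),
    ({"phoneme_class": "long_vowel", "stressed": "no", "position": "final"}, 0.18),
]


def mean_squared_error(model, data):
    return sum((model.predict(encode(s)) - d) ** 2 for s, d in data) / len(data)


def fit(model, data, epochs=500, lr=0.05):
    """Plain per-sample gradient descent over the data set."""
    for _ in range(epochs):
        for sample, duration in data:
            model.train_step(encode(sample), duration, lr)
    return model
```

A call such as `fit(DurationMLP(n_in=8), TOY_DATA)` followed by `model.predict(encode(sample))` returns a predicted duration for a feature dictionary. A practical system would use a richer feature set derived from the text analysis stage and durations measured from a recorded speech corpus.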