Paper information - Generating fundamental frequency contours for speech synthesis in Yorùbá

Generating fundamental frequency contours for speech synthesis in Yorùbá

We present methods for modelling and synthesising fundamental frequency (F0) contours suitable for application in text-to-speech (TTS) synthesis of Yoruba (an African tone language). These methods are discussed and compared with a baseline approach using the HMM-based speech synthesis system HTS. Evaluation is done by comparing ten-fold cross-validation squared errors on a small corpus of four speakers. We show that the proposed methods are relatively effective at modelling and generating F0 contours in this context, achieving lower error rates than the baseline. These results suggest that our methods will be useful for improving the synthesis of tone in African languages, which has been a challenge to date.

Etienne Barnard | Daniel R. van Niekerk

[1] D. Robert Ladd, et al. Aspects of pitch realisation in Yoruba, 1990, Phonology.

[2] Dafydd Gibbon, et al. Towards an unrestricted domain TTS system for African tone languages, 2008, Int. J. Speech Technol.

[3] Karen Courtenay. Yoruba: A 'terraced-level' language with three tonemes, 2010.

[4] Yi Xu, et al. Speech melody as articulatorily implemented communicative functions, 2005, Speech Commun.

[5] Heiga Zen, et al. Statistical Parametric Speech Synthesis, 2007, IEEE International Conference on Acoustics, Speech, and Signal Processing.

[6] Etienne Barnard, et al. A general-purpose IsiZulu speech synthesizer, 2005.

[7] Paul Boersma, et al. Praat, a system for doing phonetics by computer, 2002.

[8] Etienne Barnard, et al. Pronunciation prediction with Default&Refine, 2008, Comput. Speech Lang.

[9] Heiga Zen, et al. The HMM-based speech synthesis system (HTS) version 2.0, 2007, SSW.

[10] Santitham Prom-on, et al. Modeling tone and intonation in Mandarin and English as a process of target approximation, 2009, The Journal of the Acoustical Society of America.

[11] Etienne Barnard, et al. Predicting utterance pitch targets in Yorùbá for tone realisation in speech synthesis, 2014, Speech Commun.

[12] Etienne Barnard, et al. Tone realisation in a Yorùbá speech recognition corpus, 2012, SLTU.

[13] George N. Clements, et al. Downstep and high raising: interacting factors in Yoruba tone production, 2003, J. Phonetics.