An intonation model for TTS in Sepedi

We present an initial investigation into the acoustic realisation of tone in continuous utterances in Sepedi (a language in the Southern Bantu family). An analytic model for generating appropriate pitch contours, given an utterance with a linguistic tone specification, is presented and evaluated. By comparing the model output to speech data from a small tone-marked corpus, we conclude that the initial implementation presented here is capable of generating pitch contours that exhibit some realistic properties, and we identify a number of aspects that require further attention. Lastly, we present some initial perceptual results from integrating the proposed model into a Hidden Markov Model-based speech synthesis system.

Index Terms: speech synthesis, tone languages, Sepedi
