Rapid Development of TTS Corpora for Four South African Languages

This paper describes the development of text-to-speech (TTS) corpora for four South African languages. The approach investigated whether low-cost methods, including informal recording environments and untrained volunteer speakers, are viable. This objective, together with the future goal of expanding the corpora to cover more of South Africa’s 11 official languages, necessitated experimenting with multi-speaker and code-switched data. The process and relevant observations are detailed throughout. The latest versions of the corpora are available for download under an open-source licence and will likely see further development and refinement in future.
