Multilingual Voice Creation Toolkit for the MARY TTS Platform

This paper describes an open-source voice creation toolkit that supports the creation of unit selection and HMM-based voices for the MARY (Modular Architecture for Research on speech Synthesis) TTS platform. We aim to provide the tools and generic, reusable run-time system modules that people need to support a new language and create new voices for MARY TTS. The toolkit has been successfully applied to the creation of British English, Turkish, Telugu and Mandarin Chinese language components and voices. MARY TTS now supports these languages in addition to German and US English. The toolkit can also easily be used to create new voices in languages that MARY TTS already supports. It is mainly intended for speech technology research groups throughout the world, notably those that do not yet have their own technology. By providing reusable technology, we try to lower the entry barrier and make it easier for them to get started. The toolkit is developed in Java and includes an intuitive graphical user interface (GUI) for most of the common tasks in the creation of a synthetic voice.
