Free Tools and Resources for HMM-Based Brazilian Portuguese Speech Synthesis

Text-to-speech (TTS) is currently a mature technology used in many areas such as education and accessibility. Some modules of a TTS system are language-dependent and, while many public resources exist for some languages (e.g., English and Japanese), those for Brazilian Portuguese (BP) are still limited. This work describes the development of a complete hidden Markov model (HMM) based TTS system for BP that can be used in a desktop environment. It also releases a set of natural language processing tools for BP, expanding the publicly available resources and supporting new research for academic or industrial purposes. Subjective and objective performance tests are presented, comparing the proposed TTS system with other software currently available for BP.
