Towards the development of a Brazilian Portuguese text-to-speech system based on HMM

This paper describes the development of a Brazilian Portuguese text-to-speech system that synthesizes speech directly from hidden Markov models (HMMs). To build the synthesizer, a speech database was recorded and phonetically segmented. Contextual information about syllables, words, phrases, and utterances was then determined, along with the questions used by the decision tree-based context clustering algorithms. The resulting system reproduces prosody fairly well even when a small database is used for training.
