Japanese text-to-speech synthesizer based on residual excited speech synthesis

A Japanese text-to-speech conversion technology has been developed in which a new text analysis method and a new speech synthesis method are employed to improve the pronunciation of Kanji characters and phoneme articulation. Morpheme analysis is first performed on the input text, and pronunciation and grammatical information are extracted. For word identification in Kanji character strings, an efficient method based on Japanese word statistics is introduced, and the text is converted into Kana strings. The Kana strings are then represented as a string of Dyad-type units (CV, VC, VV). Speech parameters for the Dyad-type units are concatenated to produce continuous speech. A residual signal is used as the excitation source for LSP (line spectrum pair) synthesis in unvoiced consonant periods. An articulation test has shown that the residual signal is more effective than a noise signal and significantly improves explosive consonant synthesis.
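The step of representing a Kana string as Dyad-type units can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the Kana string has already been romanized into single-character phonemes, treats only `a, i, u, e, o` as vowels, and takes a dyad to be each overlapping pair of adjacent phonemes. The names `VOWELS`, `classify`, and `to_dyads` are hypothetical, and special morae such as the moraic nasal or geminate consonants are deliberately left out.

```python
# Hypothetical sketch: split a romanized phoneme sequence into
# overlapping dyad-type units labelled CV, VC, or VV.
# Assumption: one character per phoneme; vowels are a, i, u, e, o.

VOWELS = set("aiueo")


def classify(phoneme):
    """Return 'V' for a vowel and 'C' for anything else."""
    return "V" if phoneme in VOWELS else "C"


def to_dyads(phonemes):
    """Pair each phoneme with its successor and label the pair.

    Consonant-consonant pairs are outside the CV/VC/VV inventory
    named in the abstract, so this sketch rejects them.
    """
    units = []
    for left, right in zip(phonemes, phonemes[1:]):
        kind = classify(left) + classify(right)
        if kind == "CC":
            raise ValueError(f"unsupported consonant cluster: {left}{right}")
        units.append((kind, left + right))
    return units


if __name__ == "__main__":
    for word in ("sakura", "aoi"):
        print(word, to_dyads(list(word)))
```

Under these assumptions, `sakura` yields the alternating CV and VC units `sa`, `ak`, `ku`, `ur`, `ra`, while `aoi` yields the two VV units `ao` and `oi`. A synthesizer following the abstract would then look up stored speech parameters for each unit and concatenate them.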

K. Hakoda | T. Hirahara | K. Kabeya | K. Nagakura

[1] Rolf Carlson, et al. MITalk-79: The 1979 MIT text-to-speech system, 1979.