Paper information - Speech synthesis from text

Speech synthesis from text

Text analysis for speech synthesis is described in relation to the information needed in speech production. This includes a pronouncing dictionary and letter-to-sound rules, morphological analysis and accent assignment, and syntactic analysis. Prosody control rules (fundamental frequency control and segmental duration control) are examined. Speech units for synthesis and parametric representation of speech signals are discussed. Applications and development tools are considered.

Y. Sagisaka