A Swedish text-to-speech system based on an area function model

A complete text-to-speech system has been developed. This paper, however, is limited to a discussion of some extensions to the area function approach to speech synthesis. The voiced source uses a pulse, shaped by two sinusoids, to represent the opening and the closing phases of one glottal period. Moreover, to model the motion of the vocal cords, one reflection coefficient is made time-varying in accordance with the glottal wave. Thereby the formants are time-varying over a pitch cycle. For the production of fricative sounds, the unvoiced source can be moved to different positions in the vocal tract.
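The voiced source and the time-varying reflection coefficient described above can be illustrated with a minimal sketch. The abstract does not give the pulse formula or the coefficient mapping, so everything specific here is an assumption: the pulse uses a Rosenberg-style trigonometric shape (a rising half-cosine for the opening phase and a falling quarter-cosine for the closing phase), and the hypothetical function `glottal_reflection` derives a glottal-end reflection coefficient from an assumed glottal area proportional to the pulse. The names `glottal_pulse`, `open_frac`, `close_frac`, `tube_area`, and `max_glottal_area`, and their default values, are illustrative only.

```python
import math


def glottal_pulse(n_samples, open_frac=0.4, close_frac=0.16):
    """Return one glottal period of n_samples values in [0, 1].

    Assumed shape (not taken from the paper): a rising half-cosine
    for the opening phase, a falling quarter-cosine for the closing
    phase, and zero while the glottis is closed.
    """
    n_open = int(round(open_frac * n_samples))
    n_close = int(round(close_frac * n_samples))
    pulse = []
    for n in range(n_samples):
        if n < n_open:
            # Opening phase: rises smoothly from 0 toward 1.
            pulse.append(0.5 * (1.0 - math.cos(math.pi * n / n_open)))
        elif n < n_open + n_close:
            # Closing phase: falls from 1 toward 0, faster than it rose.
            pulse.append(math.cos(math.pi * (n - n_open) / (2.0 * n_close)))
        else:
            # Closed phase: no glottal flow.
            pulse.append(0.0)
    return pulse


def glottal_reflection(pulse, tube_area=1.0, max_glottal_area=0.2):
    """Map the glottal wave to a time-varying reflection coefficient.

    Hypothetical mapping: glottal area is taken as proportional to the
    pulse, and the coefficient is the area ratio between the glottis
    and the first tube section. A closed glottis gives 1.0 (full
    reflection); an open glottis gives a smaller value.
    """
    coeffs = []
    for g in pulse:
        glottal_area = max_glottal_area * g
        coeffs.append((tube_area - glottal_area) / (tube_area + glottal_area))
    return coeffs


if __name__ == "__main__":
    period = glottal_pulse(100)
    r_glottis = glottal_reflection(period)
    print(max(period), min(r_glottis), max(r_glottis))
```

Because the coefficient changes within each period, the resonances of a tube model driven this way shift between the open and closed phases, which is the effect the abstract describes as formants varying over a pitch cycle. The sign convention for reflection coefficients differs between texts, so the mapping would need to match the convention of the tube model it is used with.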
