Waveform-based speech synthesis approach with a formant frequency modification

A novel approach to speech synthesis based on waveform segments is proposed. Its main contribution is a formant frequency modification algorithm that changes formant frequencies flexibly and thereby reproduces the desired speech quality. The algorithm characterizes each speech formant not only by its frequency and bandwidth but also by the spectral intensity at the formant frequency. The desired formant structure, specified by these parameters, is obtained by iteratively modifying the formant bandwidths. From the specified formant structure, the speech signal is synthesized using the fast Fourier transform (FFT). Evaluation with an acoustic distance measure and with listening tests confirms the good performance of the approach. In the listening tests, the proposed method significantly increased the naturalness of the speech and clearly improved its quality.
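The bandwidth iteration described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single formant modeled as a two-pole digital resonator, ignores interaction between neighboring formants, and uses plain bisection as the iterative update. The function names, the search range, and the tolerance are all hypothetical choices made for illustration. The sketch relies only on the fact that, for such a resonator, narrowing the bandwidth raises the spectral intensity at the formant frequency.

```python
import cmath
import math


def peak_gain_db(freq_hz, bandwidth_hz, fs_hz):
    """Gain in dB of a two-pole resonator, evaluated at its centre frequency.

    Assumed model: H(z) = 1 / (1 - 2 r cos(theta) z^-1 + r^2 z^-2),
    with pole radius r = exp(-pi * B / fs) and pole angle theta = 2 pi F / fs.
    """
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)
    theta = 2.0 * math.pi * freq_hz / fs_hz
    z_inv = cmath.exp(-1j * theta)
    denom = 1.0 - 2.0 * r * math.cos(theta) * z_inv + (r * z_inv) ** 2
    return -20.0 * math.log10(abs(denom))


def fit_bandwidth(freq_hz, target_db, fs_hz,
                  lo_hz=10.0, hi_hz=2000.0, tol_db=0.01, max_iter=60):
    """Iteratively adjust the bandwidth until the intensity at freq_hz
    matches target_db (bisection; the search range is an assumption)."""
    lo, hi = lo_hz, hi_hz
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        gain = peak_gain_db(freq_hz, mid, fs_hz)
        if abs(gain - target_db) < tol_db:
            return mid
        if gain > target_db:
            lo = mid  # peak too strong: widen the bandwidth
        else:
            hi = mid  # peak too weak: narrow the bandwidth
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    fs = 16000.0
    # Intensity produced by a 1 kHz formant with an 80 Hz bandwidth ...
    target = peak_gain_db(1000.0, 80.0, fs)
    # ... and the bandwidth recovered from that intensity alone.
    print(fit_bandwidth(1000.0, target, fs))
```

Bisection is used here only because the peak gain of this resonator falls monotonically as the bandwidth grows, which guarantees convergence inside the search range. A full system would also have to account for the contribution of adjacent formants to each spectral peak, which this sketch leaves out.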
