A recently proposed technique for high-quality waveform manipulation is shown to be expressible as a pitch-excited vocoder. This waveform vocoder produces high-quality speech over a wide range of prosodic modifications, which demonstrates that natural-sounding speech can be produced with an impulse-driven linear synthesis model. In a pilot experiment, waveform vocoding techniques were applied to the LPC (linear predictive coding) residue to investigate the relative importance of amplitude and phase in the synthesis of male and female voices. Amplitude information contributed more to speech quality than phase information, and for male voices amplitude information alone was sufficient to make the synthetic speech almost indistinguishable in quality from natural speech.
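The amplitude-versus-phase comparison in the pilot experiment can be illustrated with a minimal sketch. It is not the procedure used in the paper: it assumes a single pitch period of LPC residue is already available as a list of samples, uses a naive DFT for self-containment, and the function names (`dft`, `idft`, `amplitude_only`, `phase_only`) are hypothetical. One variant keeps the amplitude spectrum and sets the phase to zero; the other keeps the phase and flattens the amplitude to one.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def amplitude_only(residue_period):
    """Keep the amplitude spectrum and set every phase to zero (assumed variant)."""
    return idft([abs(c) for c in dft(residue_period)])


def phase_only(residue_period):
    """Keep the phase spectrum and set every amplitude to one (assumed variant)."""
    return idft([cmath.exp(1j * cmath.phase(c)) for c in dft(residue_period)])


# Toy stand-in for one pitch period of LPC residue (illustrative values only).
frame = [0.0, 1.0, -0.5, 0.25, 0.0, 0.1, -0.2, 0.05]
zero_phase_frame = amplitude_only(frame)
flat_amplitude_frame = phase_only(frame)
```

Feeding either frame to the same LPC synthesis filter would give the two conditions to compare by listening. How the periods are extracted and overlap-added is left open here, since the abstract does not specify it.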