The waveform segment vocoder: A new approach for very-low-rate speech coding
We propose a new method of synthesis to be used for the segment vocoder, which transmits intelligible speech at rates below 300 b/s. The earlier segment vocoder applies LPC analysis to input speech, divides it into segments of variable duration, matches each segment with the nearest template from a codebook, concatenates at the receiver the set of nearest templates, and finally synthesizes the resultant sequence of speech frames using LPC synthesis. The quality of such a segment vocoder cannot exceed that of a standard unquantized LPC vocoder, which sounds buzzy due to the pulse/noise excitation used. Alternatively, by beginning with the waveforms (not the spectral representation) corresponding to the set of nearest templates, we can independently modify the pitch, energy, and duration of each template to match those of the input segment. These modified segments are then concatenated to produce the output waveform. We present here methods for high-quality modification of the pitch and duration of a segment of a speech waveform and show how these methods can be applied to improve the quality of the segment vocoder's output speech.
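As a rough illustration of two steps named above, matching an input segment to its nearest codebook template and adjusting the energy of the template's waveform to that of the input, the following minimal sketch may help. It is not the authors' implementation: the linear time normalization to a fixed number of frames, the squared Euclidean distance, the RMS-based gain, and every function name (`resample_linear`, `segment_distance`, `nearest_template`, `match_energy`) are assumptions made for the example. Pitch and duration modification of the waveform are not shown.

```python
import math


def resample_linear(frames, n_points):
    """Linearly interpolate a list of feature vectors to n_points frames.

    Assumption: variable-duration segments are compared after a simple
    linear time normalization (n_points must be at least 2).
    """
    if len(frames) == 1:
        return [list(frames[0]) for _ in range(n_points)]
    out = []
    for i in range(n_points):
        pos = i * (len(frames) - 1) / (n_points - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(frames) - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


def segment_distance(seg_a, seg_b, n_points=10):
    """Squared Euclidean distance between two time-normalized segments."""
    a = resample_linear(seg_a, n_points)
    b = resample_linear(seg_b, n_points)
    return sum((x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb))


def nearest_template(segment, codebook, n_points=10):
    """Return the index of the codebook template closest to the segment."""
    return min(
        range(len(codebook)),
        key=lambda k: segment_distance(segment, codebook[k], n_points),
    )


def rms(wave):
    """Root-mean-square amplitude of a list of waveform samples."""
    return math.sqrt(sum(s * s for s in wave) / len(wave))


def match_energy(template_wave, target_rms):
    """Scale a template waveform so that its RMS equals target_rms."""
    current = rms(template_wave)
    if current == 0.0:
        return list(template_wave)  # a silent template cannot be rescaled
    gain = target_rms / current
    return [gain * s for s in template_wave]


# Hypothetical two-template codebook of 2-dimensional spectral frames.
codebook = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[5.0, 5.0], [6.0, 6.0], [7.0, 7.0]],
]
segment = [[4.8, 5.1], [6.1, 5.9], [6.9, 7.2], [7.0, 7.0]]
index = nearest_template(segment, codebook)

# Scale a toy template waveform to the RMS measured on the input segment.
scaled = match_energy([1.0, -1.0, 1.0, -1.0], target_rms=0.5)
```

In this sketch only the template index and the target energy need to be known at the receiver for each segment; the pitch and duration of the input segment would be handled by the waveform modification methods the abstract describes.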
[1] Richard M. Schwartz, et al. A segment vocoder at 150 b/s, 1983, ICASSP.
[2] S. Roucos, et al. Segment quantization for very-low-rate speech coding, 1982, ICASSP.
[3] A. Wilgus, et al. High quality time-scale modification for speech, 1985, ICASSP.