Nonstationary spectral modeling of voiced speech
The main purpose of this paper is to present a novel model for voiced speech. The classical model, which is used in many applications, assumes local stationarity and consequently imposes a simple and well-known line structure on the short-time spectrum of voiced speech. The model derived in this paper allows for local non-stationarities, not only in the form of pitch perturbations but also in the form of vocal tract variations. The resulting structure of the short-time spectrum is more complex, but it can still be interpreted in terms of generalized lines. The proposed model supports new forms of spectral prediction, which can be exploited in speech coding applications. Experimental results are presented that support the validity of both the model itself and the prediction relationships. Finally, a new class of speech coders based on the presented model, denoted harmonic coders, is proposed, and a specific implementation is presented.
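The contrast between the classical line structure and its nonstationary generalization can be illustrated numerically. The following is a minimal sketch and not the model derived in the paper: it assumes a frame made of equal-amplitude harmonics, a purely linear pitch drift within the frame, a Hann analysis window, and no vocal tract filtering. The function names, the 8 kHz sampling rate, the 256-sample frame, and the RMS-width measure are all illustrative choices.

```python
import numpy as np

FS = 8000      # sampling rate in Hz (assumed for illustration)
N = 256        # frame length in samples (32 ms at 8 kHz)
NFFT = 4096    # zero-padded FFT size for a fine frequency grid


def harmonic_frame(f0_start, f0_end, n_harmonics=10):
    """Hann-windowed sum of harmonics whose pitch moves linearly
    from f0_start to f0_end (Hz) across the frame."""
    f0 = np.linspace(f0_start, f0_end, N)
    # Integrate the instantaneous pitch to get the fundamental's phase.
    phase = 2.0 * np.pi * np.cumsum(f0) / FS
    frame = sum(np.cos(k * phase) for k in range(1, n_harmonics + 1))
    return frame * np.hanning(N)


def line_width_hz(frame, k, f0_center):
    """RMS width (Hz) of the spectral energy in the band
    [(k - 0.5) * f0_center, (k + 0.5) * f0_center) around harmonic k."""
    power = np.abs(np.fft.rfft(frame, NFFT)) ** 2
    freqs = np.fft.rfftfreq(NFFT, d=1.0 / FS)
    band = (freqs >= (k - 0.5) * f0_center) & (freqs < (k + 0.5) * f0_center)
    p = power[band] / power[band].sum()
    mean = np.sum(freqs[band] * p)
    return np.sqrt(np.sum((freqs[band] - mean) ** 2 * p))


# Locally stationary frame: constant 200 Hz pitch.
stationary = harmonic_frame(200.0, 200.0)
# Nonstationary frame: pitch drifts from 190 Hz to 210 Hz within the frame.
drifting = harmonic_frame(190.0, 210.0)

for k in (2, 8):
    print(k, line_width_hz(stationary, k, 200.0), line_width_hz(drifting, k, 200.0))
```

With a constant pitch, every harmonic has essentially the same width, set by the analysis window alone. With a drifting pitch, harmonic k sweeps k times as far as the fundamental within the frame, so the higher harmonics spread out. This spreading is the kind of departure from simple lines that a nonstationary model has to account for.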