The sinusoidal model has been applied to a broad range of speech and audio processing tasks, such as analysis/synthesis, time- and frequency-scale modification, fundamental frequency modification, speech enhancement, and co-channel separation. In this paper, we propose a multiresolution analysis-by-synthesis/overlap-add (ABS/OLA) sinusoidal model based on the wavelet transform. In the proposed scheme, the input speech signal is first decomposed into multiresolution subband signals by the wavelet transform, and a classical ABS/OLA sinusoidal model with a different window length is then applied to each subband signal. We show that applying an appropriately sized analysis window to each subband yields considerably more accurate estimates of the sinusoidal components. Experimental results show that the proposed multiresolution ABS/OLA sinusoidal model outperforms the classical ABS/OLA sinusoidal model in terms of spectral characteristics, phase characteristics, and the quality of the synthetic speech.
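The two-stage idea described above (wavelet decomposition into subbands, then sinusoidal analysis with a band-specific window length) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the two-level decomposition, the window durations, and the function names `haar_dwt`, `decompose`, `strongest_sinusoid`, and `analyze` are all hypothetical. The per-band estimator is plain FFT peak picking on a Hann-windowed frame, used here as a simple stand-in for the ABS/OLA analysis, which it does not implement.

```python
import numpy as np

def haar_dwt(x):
    """One level of an orthonormal Haar DWT: returns (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                      # pad odd-length input with one zero
        x = np.append(x, 0.0)
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def decompose(x, levels):
    """Split x into subbands [detail_1, ..., detail_levels, approx_levels]."""
    bands, approx = [], x
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.append(detail)
    bands.append(approx)
    return bands

def strongest_sinusoid(frame, fs):
    """Return (freq_hz, amplitude, phase) of the largest non-DC spectral peak.

    Stand-in for ABS/OLA: simple peak picking on a Hann-windowed FFT.
    """
    n = len(frame)
    win = np.hanning(n)
    spec = np.fft.rfft(frame * win)
    k = int(np.argmax(np.abs(spec[1:]))) + 1    # skip the DC bin
    amp = 2.0 * np.abs(spec[k]) / win.sum()
    return k * fs / n, float(amp), float(np.angle(spec[k]))

def analyze(x, fs, levels, win_ms):
    """Analyze the first frame of each subband with its own window duration.

    win_ms[i] is the (hypothetical) window duration in milliseconds for
    band i, in the order returned by decompose().
    """
    results = []
    for i, band in enumerate(decompose(x, levels)):
        level = min(i + 1, levels)              # approximation shares the last level
        band_fs = fs / 2 ** level               # each level halves the sample rate
        n = int(round(win_ms[i] * band_fs / 1000.0))
        results.append(strongest_sinusoid(band[:n], band_fs))
    return results
```

As a usage example, `analyze(x, 8000, 2, [10, 20, 40])` would analyze the two detail bands with 10 ms and 20 ms windows and the low-frequency approximation band with a 40 ms window. Giving the lowest band the longest window reflects the general motivation for a multiresolution scheme (finer frequency resolution where components are closely spaced), but the specific durations are illustrative only.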