Short-time homomorphic analysis has traditionally relied on a very simple model. It is shown that there is no theoretical justification for applying this model to voiced speech and that the model is of limited value for improving cepstral deconvolution procedures. A more elaborate model is therefore introduced, in which the influence of window length is approximated and the spectral sampling inherent in voiced speech is represented explicitly. The new model shows that the vocal tract contribution to the complex cepstrum is repeated at every multiple of the pitch quefrency (n_p) and is multiplied by a double sinc-like distortion D(n). It is shown that, to achieve deconvolution with a low-time gating system, a cepstral lifter of length n_p/2 should be used (instead of the usual length "less than n_p"), and that the lifter should compensate for the distortion D(n). The accuracy of straightforward homomorphic deconvolution approximations is limited by aliasing distortion, which results from the repeated nature of the vocal tract contribution. Nevertheless, reasonable deconvolution approximations are obtained.
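The low-time gating step with a lifter of length n_p/2 can be illustrated with a minimal Python sketch. It is not the procedure of the paper: it uses the real cepstrum rather than the complex cepstrum, it omits the compensation for D(n) because the abstract does not give its form, and the function names, the synthetic single-resonance frame, and all numeric parameters (sampling rate, pitch period, FFT size) are assumptions chosen only for illustration.

```python
import numpy as np

def real_cepstrum(frame, n_fft):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(frame, n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small offset avoids log(0)
    return np.fft.ifft(log_mag).real

def low_time_lifter(cepstrum, cutoff):
    """Keep quefrencies |n| < cutoff and zero the rest (rectangular gate)."""
    gate = np.zeros(len(cepstrum))
    gate[:cutoff] = 1.0
    if cutoff > 1:
        gate[-(cutoff - 1):] = 1.0  # mirrored negative quefrencies
    return cepstrum * gate

# Hypothetical synthetic voiced frame (all values are assumptions).
fs = 8000          # sampling rate in Hz
n_p = 80           # pitch period in samples (100 Hz pitch)
n_fft = 1024
frame_len = 400

# Impulse-train excitation with one pulse every n_p samples.
excitation = np.zeros(frame_len)
excitation[::n_p] = 1.0

# Single-resonance "vocal tract": truncated impulse response of a two-pole filter.
r, theta = 0.95, 2 * np.pi * 700 / fs
k = np.arange(200)
h = r ** k * np.sin((k + 1) * theta) / np.sin(theta)

voiced = np.convolve(excitation, h)[:frame_len] * np.hamming(frame_len)

cep = real_cepstrum(voiced, n_fft)

# Lifter length n_p/2, as the abstract recommends, instead of "less than n_p".
cutoff = n_p // 2
smoothed = low_time_lifter(cep, cutoff)

# Smoothed spectral envelope recovered from the liftered cepstrum.
envelope = np.exp(np.fft.fft(smoothed).real)
```

Here `smoothed` keeps only the quefrencies below n_p/2, so the pitch peak near n_p and the repeated copies of the vocal tract contribution around it are gated out. A lifter that also compensates for D(n) would replace the rectangular gate with a weighting derived from the model in the paper.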