Spectral Peak-Weighted Liftering of Cepstral Coefficients for Speech Recognition

In this paper, we propose a peak-weighted cepstral lifter (PWL) that enhances the spectral peaks of an all-pole model spectrum in the cepstral domain. The design parameter of the PWL is the degree of pole enhancement, i.e., of pole shifting toward the unit circle. The optimal pole-shifting factor is chosen by considering the sensitivity to spectral resonance peaks, the variability of cepstral variances, and the recognition accuracy. Next, we generalize the PWL so that the optimal shifting factor is determined adaptively on a frame-by-frame basis. Compared with other cepstral lifters, a speech recognizer employing the frame-adaptive PWL provides better recognition performance.

Key words: speech recognition, cepstral analysis, peak-weighted cepstral lifter, frame-adaptive cepstral lifter
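The link between pole shifting and cepstral weighting can be illustrated with a standard identity, sketched here under stated assumptions rather than as the PWL defined in the paper. For an all-pole model 1/A(z) with poles p_i inside the unit circle, the cepstrum is c_n = (1/n) * sum_i p_i^n, so scaling every pole radially by a factor gamma multiplies c_n by gamma^n. The function names, the example pole pair, and the value of gamma below are hypothetical choices for illustration only.

```python
import numpy as np

def poles_to_cepstrum(poles, n_coeffs):
    """Cepstrum c_1..c_N of the all-pole model 1/A(z), using c_n = (1/n) * sum_i p_i^n.

    Assumes all poles lie strictly inside the unit circle and occur in
    conjugate pairs, so the result is real.
    """
    n = np.arange(1, n_coeffs + 1)
    powers = poles[:, None] ** n[None, :]      # shape: (num_poles, n_coeffs)
    return np.real(powers.sum(axis=0) / n)

def pole_shift_lifter(cepstrum, gamma):
    """Weight c_n by gamma**n, which is equivalent to scaling every pole by gamma.

    gamma > 1 moves poles toward the unit circle (sharper spectral peaks);
    the model stays stable only while gamma * max|p_i| < 1.
    """
    n = np.arange(1, len(cepstrum) + 1)
    return cepstrum * gamma ** n

# Hypothetical example: one resonance at radius 0.8, angle pi/4.
poles = 0.8 * np.exp(1j * np.array([np.pi / 4, -np.pi / 4]))
gamma = 1.1                                    # illustrative shifting factor
c = poles_to_cepstrum(poles, 12)
c_liftered = pole_shift_lifter(c, gamma)
c_direct = poles_to_cepstrum(gamma * poles, 12)
print(np.allclose(c_liftered, c_direct))
```

In this sketch a single fixed gamma is applied to every frame; a frame-adaptive variant would choose gamma per frame, and how the paper selects that value is not reproduced here.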
