Fast Implementation of KLT-Based Speech Enhancement Using Vector Quantization

We propose a new method for implementing Karhunen-Loève transform (KLT)-based speech enhancement that exploits vector quantization (VQ) and is suitable for real-time processing. The method consists of a VQ learning stage and a filtering stage. In the VQ learning stage, autocorrelation vectors comprising the first K elements of the autocorrelation function are extracted from learning data and used as codewords in the VQ codebook. The KLT bases corresponding to all the codeword vectors are then estimated through eigendecomposition (ED) of the empirical Toeplitz covariance matrices constructed from the codeword vectors. In the filtering stage, the autocorrelation vector estimated from the input signal in each frame is compared to the codewords, and the nearest codeword is chosen. The precomputed KLT bases corresponding to the chosen codeword are then used for filtering, which avoids performing the computationally intensive ED at run time. Speech quality evaluation using objective measures shows that the proposed method is comparable to a conventional KLT-based method that performs ED in the filtering process, and the results of subjective tests support this finding. In addition, processing time is reduced to about 1/66 of that of the conventional method when a frame length of 120 points is used. This complexity reduction is attained at the expense of the computational cost of the learning stage and a corresponding increase in the memory requirement. Nevertheless, these results demonstrate that the proposed method reduces computational complexity while maintaining the speech quality of KLT-based speech enhancement.
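The two stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the biased autocorrelation estimator, the plain k-means clustering used to build the codebook, and the squared Euclidean distance used for the nearest-codeword search are all assumptions made for illustration, and the enhancement gain applied in the KLT domain is omitted because the abstract does not specify it.

```python
import numpy as np

def autocorr_vector(frame, K):
    """First K lags of the biased autocorrelation estimate of one frame (assumed estimator)."""
    n = len(frame)
    return np.array([np.dot(frame[:n - k], frame[k:]) / n for k in range(K)])

def toeplitz_from_autocorr(r):
    """Symmetric Toeplitz matrix whose (i, j) entry is r[|i - j|]."""
    lags = np.abs(np.arange(len(r))[:, None] - np.arange(len(r))[None, :])
    return r[lags]

def learn_codebook(frames, K, n_codewords, n_iter=20, seed=0):
    """Learning stage: cluster autocorrelation vectors, then run one ED per codeword.

    Plain k-means is an assumption; the abstract does not name the clustering rule.
    """
    rng = np.random.default_rng(seed)
    vecs = np.array([autocorr_vector(f, K) for f in frames])
    codebook = vecs[rng.choice(len(vecs), n_codewords, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every vector to every codeword.
        dist = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(n_codewords):
            if np.any(labels == c):
                codebook[c] = vecs[labels == c].mean(axis=0)
    # Precompute eigenvalues and eigenvectors (the KLT basis) for each codeword.
    bases = [np.linalg.eigh(toeplitz_from_autocorr(r)) for r in codebook]
    return codebook, bases

def lookup_basis(frame, codebook, bases):
    """Filtering stage: pick the nearest codeword and return its precomputed basis."""
    r = autocorr_vector(frame, codebook.shape[1])
    idx = int(((codebook - r) ** 2).sum(axis=1).argmin())
    return idx, bases[idx]
```

In this sketch the eigendecomposition runs once per codeword during learning, so the per-frame cost at run time is one autocorrelation estimate plus a nearest-neighbor search over the codebook. The price is storing one K-by-K eigenvector matrix per codeword, which corresponds to the memory increase noted above.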
