Blind Channel Magnitude Response Estimation in Speech Using Spectrum Classification

We present an algorithm for blind estimation of the magnitude response of an acoustic channel from single-microphone observations of a speech signal. The algorithm uses channel-robust, RASTA-filtered Mel-frequency cepstral coefficients as features to train a Gaussian mixture model (GMM) based classifier, and an average clean speech spectrum is associated with each mixture component. These average spectra are then used to blindly estimate the acoustic channel magnitude response from speech that has undergone spectral modification due to the channel. Experimental results are presented for a variety of simulated and measured acoustic channels with additive babble noise, car noise and white Gaussian noise. The results demonstrate that the proposed method estimates a variety of channel magnitude responses to within an Itakura distance of d_I ≤ 0.5 for SNR ≥ 10 dB.
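The estimation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a diagonal-covariance GMM whose parameters are already trained, works in the log power spectral domain where a channel acts additively, and uses synthetic placeholder features instead of RASTA-filtered MFCCs. All function and variable names are hypothetical, and the spectral form of the Itakura distance used for scoring is one common definition that may differ in detail from the one used in the paper.

```python
import numpy as np

def gmm_posteriors(features, weights, means, variances):
    """Posterior probability of each mixture component for each frame.

    features: (T, D), weights: (K,), means and variances: (K, D).
    Assumes a diagonal-covariance GMM that has already been trained.
    """
    diff = features[:, None, :] - means[None, :, :]              # (T, K, D)
    log_lik = -0.5 * np.sum(diff ** 2 / variances
                            + np.log(2.0 * np.pi * variances), axis=2)
    log_post = np.log(weights) + log_lik                         # (T, K)
    log_post -= log_post.max(axis=1, keepdims=True)              # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def estimate_channel_log_power(obs_log_spectra, posteriors, clean_log_spectra):
    """Average, over frames, of observed minus expected clean log power spectrum.

    obs_log_spectra: (T, F), posteriors: (T, K), clean_log_spectra: (K, F).
    """
    expected_clean = posteriors @ clean_log_spectra              # (T, F)
    return np.mean(obs_log_spectra - expected_clean, axis=0)     # (F,)

def itakura_distance(p_ref, p_est):
    """Spectral-form Itakura distance between two power spectra (assumed form)."""
    ratio = np.asarray(p_ref, dtype=float) / np.asarray(p_est, dtype=float)
    return np.log(np.mean(ratio)) - np.mean(np.log(ratio))

# Synthetic demonstration: two well-separated classes, four frequency bins.
weights = np.array([0.5, 0.5])
means = np.array([[-5.0, -5.0], [5.0, 5.0]])
variances = np.ones((2, 2))
clean_log_spectra = np.array([[0.0, 1.0, 2.0, 1.0],
                              [2.0, 0.5, 0.0, 1.5]])
true_channel = np.array([0.3, -0.2, 0.1, -0.4])                  # log power gain per bin

labels = np.array([0, 1, 1, 0, 1])
features = means[labels]                                         # placeholder features
observed = clean_log_spectra[labels] + true_channel              # channel adds in log domain

posteriors = gmm_posteriors(features, weights, means, variances)
estimate = estimate_channel_log_power(observed, posteriors, clean_log_spectra)
distance = itakura_distance(np.exp(true_channel), np.exp(estimate))
```

In this sketch the features are taken to be unaffected by the channel, which is the role the channel-robust features play in the method described above. Noise is omitted entirely.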
