Improved Noise Characterization for Relative Impulse Response Estimation

Relative Impulse Responses (ReIRs) have several applications in speech enhancement, noise suppression, and source localization for multi-channel speech processing in reverberant environments. When the ReIR between two microphones is estimated, the noise is usually assumed to be white Gaussian. We show that the noise in this system identification problem instead depends on the microphone measurements and on the ReIR itself. We then present modifications that incorporate this new noise model into three prevalent methods: the Least Squares, Non-Stationary Frequency Domain, and Sparse Bayesian Learning based approaches. We demonstrate improvements in an experimental study using real-world measurements in various noise environments.
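
The dependence of the noise on the ReIR can be illustrated with a small simulation. The sketch below is not the authors' method: it assumes the common two-microphone model x1 = s1 + n1 and x2 = h * s1 + n2, which the abstract does not spell out, and uses synthetic white signals in place of speech. Under that assumption, regressing x2 on x1 leaves an effective noise v = n2 - h * n1, which contains the ReIR h. The names `convolution_matrix`, `estimate_reir_ls`, and all signal parameters are hypothetical choices made for illustration.

```python
import numpy as np


def convolution_matrix(x, L):
    """Return the N x L matrix whose row n is [x[n], x[n-1], ..., x[n-L+1]].

    Samples before the start of the signal are taken as zero.
    """
    N = len(x)
    padded = np.concatenate([np.zeros(L - 1), x])
    return np.stack([padded[n:n + L][::-1] for n in range(N)])


def estimate_reir_ls(x1, x2, L):
    """Plain least-squares fit of an L-tap filter mapping x1 to x2."""
    X1 = convolution_matrix(x1, L)
    h_hat, *_ = np.linalg.lstsq(X1, x2, rcond=None)
    return h_hat


# Synthetic setup (assumed for illustration, not taken from the paper).
rng = np.random.default_rng(0)
N, L = 4000, 8
h_true = rng.standard_normal(L) * np.exp(-0.5 * np.arange(L))  # decaying ReIR
s1 = rng.standard_normal(N)          # source component at microphone 1
n1 = 0.1 * rng.standard_normal(N)    # sensor noise at microphone 1
n2 = 0.1 * rng.standard_normal(N)    # sensor noise at microphone 2

x1 = s1 + n1
x2 = convolution_matrix(s1, L) @ h_true + n2

# Effective noise in the regression x2 = X1 h + v ...
v = x2 - convolution_matrix(x1, L) @ h_true
# ... equals n2 - h * n1, so it is shaped by the ReIR itself.
v_model = n2 - convolution_matrix(n1, L) @ h_true

h_hat = estimate_reir_ls(x1, x2, L)
rel_err = np.linalg.norm(h_hat - h_true) / np.linalg.norm(h_true)
print("relative LS error:", rel_err)
```

Because n1 also appears inside the regressor x1, the plain least-squares fit treats a noisy input as exact, which is one reason a white Gaussian noise assumption may be a poor match in this setting.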
