Performance analysis of the covariance-whitening and the covariance-subtraction methods for estimating the relative transfer function

Estimation of the relative transfer function (RTF) vector of a desired speech source is a fundamental problem in the design of data-dependent spatial filters. We present two common estimation methods, namely the covariance-whitening (CW) and the covariance-subtraction (CS) methods. The CW method has been shown in prior work to outperform the CS method; however, its performance has not been analyzed thus far. In this paper, we analyze the performance of the CW and CS methods and show that the CW method is superior in two cases: spatially white noise, and uniform powers of the desired speech source and of the coherent interference over all microphones. The derivations are validated by comparing them to their empirical counterparts in Monte Carlo experiments. In fact, the CW method outperforms the CS method in all tested scenarios, although there may be rare scenarios for which this is not the case.
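
The abstract names the two estimators without stating them. As a rough illustration only, the following NumPy sketch uses the formulations commonly given for CS and CW and should not be read as the exact estimators analyzed in the paper. The function names `rtf_cs` and `rtf_cw`, the reference-microphone argument `ref`, and the choice of a Cholesky factor as the whitening square root are assumptions made for this example. `R_y` is the covariance of the noisy microphone signals and `R_v` is the noise covariance, both at a single frequency bin.

```python
import numpy as np

def rtf_cs(R_y, R_v, ref=0):
    """Covariance subtraction (common formulation, assumed here)."""
    # Subtract the noise covariance to get an estimate of the
    # rank-one speech covariance, then normalize its reference column.
    R_x = R_y - R_v
    return R_x[:, ref] / R_x[ref, ref]

def rtf_cw(R_y, R_v, ref=0):
    """Covariance whitening (common formulation, assumed here)."""
    # Cholesky factor R_v = L L^H is used as the whitening square root.
    L = np.linalg.cholesky(R_v)
    L_inv = np.linalg.inv(L)
    # Whitened noisy covariance; symmetrize to suppress rounding error.
    R_w = L_inv @ R_y @ L_inv.conj().T
    R_w = 0.5 * (R_w + R_w.conj().T)
    # eigh returns eigenvalues in ascending order, so the last column
    # is the principal eigenvector.
    _, vecs = np.linalg.eigh(R_w)
    q = vecs[:, -1]
    # De-whiten and normalize by the reference microphone entry.
    h = L @ q
    return h / h[ref]

# Toy check with exact (not sample) covariances: both methods
# should recover the true RTF up to numerical precision.
rng = np.random.default_rng(0)
M = 4                                   # number of microphones (arbitrary)
h_true = rng.standard_normal(M) + 1j * rng.standard_normal(M)
h_true = h_true / h_true[0]             # RTF relative to microphone 0
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R_v = A @ A.conj().T + M * np.eye(M)    # Hermitian positive-definite noise
R_y = 2.0 * np.outer(h_true, h_true.conj()) + R_v

h_cs = rtf_cs(R_y, R_v)
h_cw = rtf_cw(R_y, R_v)
```

With exact covariances the two estimates coincide; the differences studied in the paper arise when `R_y` and `R_v` are replaced by sample covariance matrices estimated from a finite number of frames.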
