Multi-speaker DOA estimation in reverberation conditions using expectation-maximization

A novel direction of arrival (DOA) estimator for concurrent speakers in reverberant environments is presented. Reverberation, if not properly addressed, is known to degrade the performance of DOA estimators. In our contribution, the DOA estimation task is formulated as a maximum likelihood (ML) problem, which is solved using the expectation-maximization (EM) procedure. The received microphone signals are modelled as a sum of anechoic and reverberant components. The reverberant components are modelled by a time-invariant coherence matrix multiplied by a time-varying reverberation power spectral density (PSD). The PSDs of the anechoic speech and reverberant components are estimated as part of the EM procedure. It is shown that the DOA estimates obtained by the proposed algorithm are less affected by reverberation than those of competing algorithms that ignore the reverberation. An experimental study using measured room impulse responses (RIRs) demonstrates the benefit of the presented algorithm in reverberant environments.
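The signal model described in the abstract can be illustrated with a minimal sketch for a single time-frequency bin. It builds the microphone covariance as a sum of rank-one anechoic terms (one per speaker) plus a reverberation PSD times a fixed coherence matrix. Everything specific here is an assumption made for illustration and is not taken from the paper: the uniform linear array geometry, the far-field steering vector, the spherically isotropic (sinc) coherence matrix, the speed of sound, and all function names.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s (assumed value)


def steering_vector(theta_rad, mic_x, freq_hz):
    """Far-field steering vector for a linear array on the x-axis (assumed geometry)."""
    delays = mic_x * np.cos(theta_rad) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freq_hz * delays)


def diffuse_coherence(mic_x, freq_hz):
    """Time-invariant coherence matrix, assumed spherically isotropic.

    np.sinc(x) is sin(pi*x)/(pi*x), so the argument 2*f*d/c gives
    sin(2*pi*f*d/c) / (2*pi*f*d/c).
    """
    dist = np.abs(mic_x[:, None] - mic_x[None, :])
    return np.sinc(2.0 * freq_hz * dist / SPEED_OF_SOUND)


def model_covariance(thetas, speech_psds, reverb_psd, mic_x, freq_hz):
    """Covariance of one bin: anechoic rank-one terms plus reverb PSD times coherence."""
    cov = reverb_psd * diffuse_coherence(mic_x, freq_hz).astype(complex)
    for theta, psd in zip(thetas, speech_psds):
        d = steering_vector(theta, mic_x, freq_hz)
        cov += psd * np.outer(d, d.conj())
    return cov


# Hypothetical example: 4 microphones spaced 8 cm apart, two speakers at 1 kHz.
mic_x = np.array([0.0, 0.08, 0.16, 0.24])
cov = model_covariance([np.pi / 3, 2 * np.pi / 3], [1.0, 0.5], 0.2, mic_x, 1000.0)
```

In an EM scheme of the kind the abstract describes, the speech PSDs and the reverberation PSD would be re-estimated per time frame while the coherence matrix stays fixed; this sketch covers only the model, not the estimation steps.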
