A robust DOA estimation method for a linear microphone array under reverberant and noisy environments

A robust method for linear microphone arrays is proposed to address the difficulty of direction-of-arrival (DOA) estimation in reverberant and noisy environments. A direct-path dominance test based on onset detection is used to extract the time-frequency bins that contain the direct propagation of the speech. The influence of transient noise, which severely contaminates the onset test, is mitigated by a transient noise determination scheme. Then, taking voice features into account, a two-stage procedure is designed based on the extracted bins and an effective dereverberation method: a robust but possibly biased estimate is obtained from the middle frequency bins and is then refined in the higher frequency bins. The proposed method effectively alleviates the estimation bias caused by the linear arrangement of microphones and performs stably in noisy and reverberant environments. Experimental evaluation using a 4-element microphone array demonstrates the efficacy of the proposed method.
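As a rough illustration of the two ingredients named above, the following sketch flags onset-like time-frequency bins and maps an inter-microphone phase difference to a far-field angle. It is not the proposed algorithm: the recursive-average onset rule, the function names `onset_mask` and `pair_doa_deg`, the parameters `alpha` and `ratio_db`, the phase sign convention, and the speed of sound of 343 m/s are all assumptions made for this example.

```python
import numpy as np

def onset_mask(power, alpha=0.9, ratio_db=6.0, floor=1e-12):
    """Flag bins whose power jumps above a recursive average of past frames.

    power: array of shape (n_freq, n_frames) holding |X(f, t)|^2.
    alpha, ratio_db: hypothetical smoothing factor and onset threshold.
    """
    n_freq, n_frames = power.shape
    mask = np.zeros((n_freq, n_frames), dtype=bool)
    past = power[:, 0].copy()  # running estimate of the preceding energy
    for t in range(1, n_frames):
        ratio = 10.0 * np.log10((power[:, t] + floor) / (past + floor))
        mask[:, t] = ratio > ratio_db
        past = alpha * past + (1.0 - alpha) * power[:, t]
    return mask

def pair_doa_deg(phase_diff, freq_hz, spacing_m, c=343.0):
    """Far-field DOA in degrees from the array axis for one microphone pair.

    Assumes a plane wave and phase_diff = 2*pi*f*d*cos(theta)/c.
    """
    cos_theta = c * phase_diff / (2.0 * np.pi * freq_hz * spacing_m)
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
```

The phase-to-angle mapping is unambiguous only below the spatial aliasing frequency c/(2d) of the chosen pair. The arccos mapping also becomes very steep near endfire, so small phase errors there translate into large angle errors, which is one way to see why linear arrays are prone to biased estimates.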
