Sound source localization and speech enhancement with sparse Bayesian learning beamforming.

Speech localization and enhancement involve sound source mapping and reconstruction from noisy microphone-array recordings of speech mixtures. Conventional beamforming methods suffer from low resolution, especially with a limited number of microphones. In practice, there are only a few sources compared to the possible directions-of-arrival (DOA). Hence, DOA estimation is formulated as a sparse signal reconstruction problem and solved with sparse Bayesian learning (SBL). SBL uses hierarchical two-level Bayesian inference to reconstruct sparse estimates from a small set of observations. The first level derives the posterior probability of the complex source amplitudes from the data likelihood and the prior. The second level tunes the prior towards sparse solutions with hyperparameters that maximize the evidence, i.e., the data probability. The adaptive learning of the hyperparameters from the data auto-regularizes the inference problem towards sparse, robust estimates. Simulations and experimental data demonstrate that SBL beamforming provides high-resolution DOA maps that outperform traditional methods, especially for correlated or non-stationary signals. Specifically for speech signals, the high-resolution SBL reconstruction offers not only speech enhancement but also, effectively, speech separation.
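The two-level inference described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a narrowband uniform linear array with half-wavelength spacing, a zero-mean complex Gaussian prior with one variance hyperparameter per grid direction, a known noise variance, and one commonly used multi-snapshot fixed-point update for the hyperparameters. The names `steering_matrix`, `sbl_doa`, `gamma`, and `noise_var` are hypothetical and chosen for illustration only.

```python
import numpy as np


def steering_matrix(n_sensors, angles_deg, spacing=0.5):
    """Plane-wave steering vectors for a uniform linear array.

    `spacing` is the sensor spacing in wavelengths (assumed 0.5 here).
    Returns an (n_sensors, n_angles) complex matrix.
    """
    n = np.arange(n_sensors)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))[None, :]
    return np.exp(2j * np.pi * spacing * n * np.sin(theta))


def sbl_doa(Y, A, noise_var, n_iter=200, tol=1e-6):
    """Sketch of SBL for DOA estimation with a known noise variance.

    Y: (n_sensors, n_snapshots) array data, A: (n_sensors, n_grid) dictionary.
    Returns the hyperparameters gamma (prior source variance per direction)
    and the posterior mean of the complex source amplitudes.
    """
    n_sensors, n_snap = Y.shape
    gamma = np.ones(A.shape[1])
    eye = np.eye(n_sensors)
    for _ in range(n_iter):
        # Second level: model data covariance implied by the current prior.
        Sigma_y = noise_var * eye + (A * gamma) @ A.conj().T
        Sinv_A = np.linalg.solve(Sigma_y, A)
        Sinv_Y = np.linalg.solve(Sigma_y, Y)
        # Fixed-point update that increases the evidence (assumed variant):
        # gamma_m <- gamma_m * ||Y^H Sigma_y^-1 a_m||^2 / (L a_m^H Sigma_y^-1 a_m)
        num = np.sum(np.abs(A.conj().T @ Sinv_Y) ** 2, axis=1) / n_snap
        den = np.real(np.sum(A.conj() * Sinv_A, axis=0))
        gamma_new = gamma * num / den
        converged = np.sum(np.abs(gamma_new - gamma)) <= tol * np.sum(gamma)
        gamma = gamma_new
        if converged:
            break
    # First level: posterior mean of the amplitudes given the learned prior.
    Sigma_y = noise_var * eye + (A * gamma) @ A.conj().T
    X_mean = (gamma[:, None] * A.conj().T) @ np.linalg.solve(Sigma_y, Y)
    return gamma, X_mean
```

In this sketch, plotting `gamma` against the angle grid gives the DOA map, and the rows of `X_mean` at the peak directions are the reconstructed source signals, which is the sense in which the reconstruction also separates sources. Estimating the noise variance from the data, handling multiple frequencies, and non-uniform array geometries are left out.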
