A postfiltering approach for dual-microphone smartphones

Although beamforming is a powerful tool for microphone array speech enhancement, its performance with small arrays, such as the two microphones of a smartphone, is quite limited. The goal of this paper is to study different postfiltering approaches that allow for further noise reduction. These postfilters are applied to our previously proposed extended Kalman filter framework for relative transfer function estimation in the context of minimum variance distortionless response beamforming. We study two postfilters, one based on Wiener filtering and the other on non-linear estimation of the speech amplitude. We also propose several estimators of the clean speech power spectral density that exploit the speaker position with respect to the device. The proposals are evaluated for speech enhancement on a dual-microphone smartphone in different noisy acoustic environments, in terms of both perceptual quality and speech intelligibility. Experimental results show that our proposals achieve greater noise reduction than other related approaches from the literature.
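
To make the processing chain concrete, the following is a minimal sketch of a minimum variance distortionless response beamformer followed by a single-channel Wiener postfilter for one time-frequency bin. It is an illustration under stated assumptions, not the method evaluated in the paper: the function names, the gain floor, and the numeric values are hypothetical, the relative transfer function and noise covariance are taken as given rather than estimated with an extended Kalman filter, and the clean speech power spectral density is supplied directly rather than computed with the proposed estimators.

```python
import numpy as np

def mvdr_weights(rtf, noise_cov):
    """MVDR weights for one frequency bin.

    rtf: complex vector (M,), relative transfer function (reference mic = 1).
    noise_cov: (M, M) Hermitian positive-definite noise covariance.
    Both inputs are assumed known here (hypothetical values below).
    """
    inv_d = np.linalg.solve(noise_cov, rtf)      # Phi_n^{-1} d
    return inv_d / (rtf.conj() @ inv_d)          # w = Phi_n^{-1} d / (d^H Phi_n^{-1} d)

def residual_noise_psd(weights, noise_cov):
    """Noise power left at the beamformer output: w^H Phi_n w."""
    return np.real(weights.conj() @ noise_cov @ weights)

def wiener_postfilter_gain(speech_psd, noise_psd, floor=1e-3):
    """Wiener gain phi_s / (phi_s + phi_n), limited below by an assumed floor."""
    gain = speech_psd / (speech_psd + noise_psd + 1e-12)
    return np.maximum(gain, floor)

# Hypothetical two-microphone example for a single bin.
rtf = np.array([1.0, 0.5 * np.exp(1j * 0.3)])
noise_cov = np.array([[1.0, 0.2 + 0.1j],
                      [0.2 - 0.1j, 1.5]])
x = np.array([0.8 + 0.1j, 0.3 - 0.2j])           # noisy observation at both mics

w = mvdr_weights(rtf, noise_cov)
beamformed = np.vdot(w, x)                        # w^H x
phi_n_out = residual_noise_psd(w, noise_cov)
phi_s = 0.5                                       # assumed clean speech PSD estimate
enhanced = wiener_postfilter_gain(phi_s, phi_n_out) * beamformed
```

The beamformer leaves the speech component at the reference microphone undistorted, so the postfilter only has to deal with the residual noise at its output; the quality of the result then depends largely on how well the clean speech power spectral density is estimated.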
