PEVD-Based Speech Enhancement in Reverberant Environments

The enhancement of noisy speech is important for applications involving human-to-human interaction, such as telecommunications and hearing aids, as well as human-to-machine interaction, such as voice-controlled systems and robot audition. In this work, we focus on reverberant environments. We show that further noise reduction can be achieved by exploiting the lack of correlation between speech and the late reflections. This is verified using simulations with measured acoustic impulse responses and noise from the ACE corpus. The simulations show that, even without a noise estimator, the proposed PEVD-based method simultaneously reduces noise and improves speech quality and intelligibility in reverberant environments over a wide range of SNRs. Furthermore, informal listening examples indicate that the approach does not introduce significant processing artefacts such as musical noise.
