Joint Estimation of Acoustic Parameters from Single-Microphone Speech Observations

Three key parameters characterise the degree of degradation in an acoustic environment: reverberation time (RT), direct-to-reverberant ratio (DRR) and signal-to-noise ratio (SNR). These parameters are inherently interdependent, which can hinder existing methods designed to estimate only one of them. To address this, we propose a data-driven solution that jointly estimates all three parameters using a convolutional neural network. To make the method robust to unseen acoustic conditions, it is trained on a large set of simulated acoustic impulse responses, carefully selected to mitigate the interplay between RT and DRR. In this work, we evaluate the performance of the proposed estimator with respect to reverberation only. Results show that the estimator compares favourably with the state of the art in unseen and real acoustic scenarios.
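To make the three parameters concrete, the following is a minimal sketch of how RT, DRR and SNR can be computed when the impulse response and the noise are known. It is not the proposed CNN estimator, which works blindly from speech. The toy impulse response (a direct-path impulse followed by exponentially decaying Gaussian noise), the function names, the -5 to -25 dB fitting range and the 2.5 ms direct-path window are all assumptions made for illustration.

```python
import numpy as np

def synthetic_rir(rt60, fs=16000, length_s=1.0, seed=0):
    """Toy RIR (assumed model): unit direct path plus decaying Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(length_s * fs)) / fs
    # Amplitude envelope that falls by 60 dB after rt60 seconds.
    tail = 0.1 * rng.standard_normal(t.size) * 10.0 ** (-3.0 * t / rt60)
    tail[: int(0.003 * fs)] = 0.0  # leave the first 3 ms to the direct path
    tail[0] = 1.0                  # direct-path impulse
    return tail

def estimate_rt60(rir, fs):
    """RT from the Schroeder energy decay curve, line fit from -5 to -25 dB."""
    edc = np.cumsum(rir[::-1] ** 2)[::-1]      # backward-integrated energy
    edc_db = 10.0 * np.log10(edc / edc[0])
    idx = np.where((edc_db <= -5.0) & (edc_db >= -25.0))[0]
    slope, _ = np.polyfit(idx / fs, edc_db[idx], 1)  # decay rate in dB/s
    return -60.0 / slope

def compute_drr(rir, fs, window_ms=2.5):
    """DRR in dB: energy near the direct-path peak over the energy after it."""
    peak = int(np.argmax(np.abs(rir)))
    half = int(window_ms * 1e-3 * fs)
    lo, hi = max(peak - half, 0), peak + half + 1
    direct = np.sum(rir[lo:hi] ** 2)
    reverberant = np.sum(rir[hi:] ** 2)
    return 10.0 * np.log10(direct / reverberant)

def add_noise_at_snr(signal, snr_db, seed=0):
    """Return white Gaussian noise scaled so that signal/noise power = snr_db."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(signal.size)
    target_noise_power = np.mean(signal ** 2) / 10.0 ** (snr_db / 10.0)
    return noise * np.sqrt(target_noise_power / np.mean(noise ** 2))

if __name__ == "__main__":
    fs = 16000
    for rt60 in (0.3, 0.5, 1.0):
        rir = synthetic_rir(rt60, fs)
        print(rt60, estimate_rt60(rir, fs), compute_drr(rir, fs))
```

In this toy model a longer RT puts more energy into the reverberant tail and therefore lowers the DRR, even though the direct path is unchanged. This is one simple instance of the RT and DRR interplay mentioned above.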
