Paper information - Real-Time Speech Enhancement with GCC-NMF: Demonstration on the Raspberry Pi and NVIDIA Jetson

We demonstrate a real-time, open-source implementation of the online GCC-NMF stereo speech enhancement algorithm. While the system runs on a variety of operating systems and hardware platforms, we highlight its potential for real-world mobile use by presenting it on two embedded systems: the Raspberry Pi 3 and the NVIDIA Jetson TX1. The effect of various algorithm parameters on subjective enhancement quality may be explored interactively via a graphical user interface, with the results heard in real time. The trade-off between interference suppression and target fidelity is controlled by manipulating the parameters of the coefficient masking function. Increasing the size of the pre-learned dictionary improves overall speech enhancement quality at increased computational cost. We show that real-time GCC-NMF has potential for real-world application, remaining purely unsupervised and retaining the simplicity and flexibility of offline GCC-NMF.
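As a rough illustration of how a coefficient masking function can trade interference suppression against target fidelity, the following is a minimal NumPy sketch for a single stereo STFT frame. It is not the authors' implementation: the function names (`atom_tdoas`, `soft_mask`, `enhance_frame`), the bell-shaped mask and its parameters (`width_s`, `shape`, `floor`), and the sign convention for the delay are all assumptions made for this example, and the dictionary `W` and coefficients `h` are assumed to be given.

```python
import numpy as np


def atom_tdoas(W, x_left, x_right, freqs_hz, taus_s, eps=1e-12):
    """Estimate one time delay per dictionary atom for a single stereo frame.

    W: (F, K) non-negative dictionary; x_left, x_right: (F,) complex spectra.
    freqs_hz: (F,) bin frequencies; taus_s: (T,) candidate delays in seconds.
    Assumed convention: a positive delay means the right channel lags the left.
    """
    cross = x_left * np.conj(x_right)
    phat = cross / (np.abs(cross) + eps)  # PHAT weighting: keep phase only
    # Steering term for every (frequency, candidate delay) pair, shape (F, T).
    steering = np.exp(-2j * np.pi * np.outer(freqs_hz, taus_s))
    # Weight the GCC-PHAT terms by each atom's spectrum, shape (K, T).
    gcc_nmf = W.T @ np.real(phat[:, None] * steering)
    return taus_s[np.argmax(gcc_nmf, axis=1)]


def soft_mask(tdoa_error_s, width_s=1e-4, shape=2.0, floor=0.0):
    """Hypothetical soft mask in [floor, 1] (assumed form, not from the paper).

    A narrower width_s or larger shape suppresses more interference; a wider
    width_s or higher floor preserves more of the target.
    """
    d = np.abs(np.asarray(tdoa_error_s, dtype=float))
    return floor + (1.0 - floor) * np.exp(-(d / width_s) ** shape)


def enhance_frame(W, h, x_left, x_right, freqs_hz, taus_s, target_tdoa_s,
                  **mask_params):
    """Apply a Wiener-like gain built from the masked coefficients h (K,)."""
    error = atom_tdoas(W, x_left, x_right, freqs_hz, taus_s) - target_tdoa_s
    mask = soft_mask(error, **mask_params)
    v_target = W @ (mask * h)  # magnitude model of the target only
    v_total = W @ h            # magnitude model of the full mixture
    gain = v_target / (v_total + 1e-12)
    return gain * x_left, gain * x_right
```

In this sketch a larger dictionary (more columns in `W`) increases the cost of the `W.T @ ...` product linearly in the number of atoms, which is one way to see why a larger dictionary costs more per frame.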

Jean Rouat | Sean U. N. Wood
