We demonstrate a real-time, open-source implementation of the online GCC-NMF stereo speech enhancement algorithm. While the system runs on a variety of operating systems and hardware platforms, we highlight its potential for real-world mobile use by presenting it on two embedded systems: the Raspberry Pi 3 and the NVIDIA Jetson TX1. The effect of various algorithm parameters on subjective enhancement quality may be explored interactively via a graphical user interface, with the results heard in real time. The trade-off between interference suppression and target fidelity is controlled by adjusting the parameters of the coefficient masking function. Increasing the size of the pre-learned dictionary improves overall speech enhancement quality at increased computational cost. We show that real-time GCC-NMF has potential for real-world application, remaining purely unsupervised and retaining the simplicity and flexibility of offline GCC-NMF.