FPGA multi-filter system for speech enhancement via multi-criteria optimization

Speech is the primary medium of human communication and interaction. Beyond traditional telephones, a growing number of applications provide speech interfaces that take the speech signal as input for various purposes. However, many of these applications may fail in noisy environments as the signal-to-noise ratio (SNR) degrades. Two important measures of any speech enhancement algorithm are noise suppression and speech distortion, and different algorithms naturally strike different trade-offs between them. Moreover, depending on the environment, one algorithm may outperform the others in some respects. This paper proposes a multi-filter system that can continually adjust the noise suppression level and the speech distortion level in a Pareto fashion. We show that the system works in a variety of noisy environments, and we obtain the efficient frontier of the combined filters for each background noise. Because the filters adapt in parallel, the final system can be implemented efficiently on an FPGA.
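
To make the notion of an efficient frontier concrete, the following minimal Python sketch extracts the Pareto-optimal operating points from a set of candidate filter combinations. It is an illustration only, not the paper's optimization method: the function name `efficient_frontier`, the representation of each candidate as a (noise suppression, speech distortion) pair, and the sample values are all assumptions. Higher suppression is treated as better and lower distortion as better.

```python
from typing import List, Tuple

# Hypothetical representation: (noise_suppression, speech_distortion).
# Higher suppression is better; lower distortion is better.
Point = Tuple[float, float]


def efficient_frontier(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, ordered by decreasing suppression."""
    # Sort by suppression (descending), breaking ties by distortion (ascending).
    ordered = sorted(set(points), key=lambda p: (-p[0], p[1]))
    frontier: List[Point] = []
    best_distortion = float("inf")
    for suppression, distortion in ordered:
        # Every point already scanned has at least this much suppression, so
        # this point is non-dominated only if its distortion is strictly lower.
        if distortion < best_distortion:
            frontier.append((suppression, distortion))
            best_distortion = distortion
    return frontier


# Made-up measurements for five candidate filter combinations.
candidates = [(10.0, 0.5), (15.0, 1.2), (12.0, 1.5), (20.0, 3.0), (8.0, 0.8)]
print(efficient_frontier(candidates))
```

In this made-up data set, the candidates (12.0, 1.5) and (8.0, 0.8) are dominated, since another candidate offers more suppression with less distortion, so only the remaining three lie on the frontier.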
