Paper Information - Speech Enhancement Using Heterogeneous Information

This article describes how to use heterogeneous information in speech enhancement. In most current speech enhancement systems, clean speech is recovered only from the signals collected by acoustic microphones, which are greatly affected by acoustic noise. However, heterogeneous information from different kinds of sensors, usually called the "multi-stream," is seldom used in speech enhancement because the speech waveform cannot be recovered from the signals provided by many kinds of sensors. In this article, the authors propose a new model-based multi-stream speech enhancement framework that can make use of the heterogeneous information provided by the signals from different kinds of sensors, even when some of them are not directly related to the speech waveform. A new speech enhancement scheme using acoustic and throat microphone recordings is then proposed based on this framework. Experimental results show that the proposed scheme outperforms several single-stream speech enhancement methods in different noisy environments.

Keywords: Heterogeneous Information, Model-Based, Multi-Stream, Speech Enhancement, Throat Microphone

Jun Zhang | Qiang Chen | Fang Xu | Yan Xiong
