Speech enhancement by LSTM-based noise suppression followed by CNN-based speech restoration
Wouter Tirry | Tim Fingscheidt | Bruno Defraene | Maximilian Strake | Kristoff Fluyt
[1] Y. Ephraim,et al. Speech enhancement using the multistage Wiener filter , 2009, 2009 43rd Annual Conference on Information Sciences and Systems.
[2] Tim Fingscheidt,et al. Quality assessment of speech enhancement systems by separation of enhanced speech, noise, and echo , 2007, INTERSPEECH.
[3] Richard C. Hendriks,et al. Unbiased MMSE-Based Noise Power Estimation With Low Complexity and Low Tracking Delay , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[4] Naoya Takahashi,et al. MMDenseLSTM: An Efficient Combination of Convolutional and Recurrent Neural Networks for Audio Source Separation , 2018, 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC).
[5] Wouter Tirry,et al. Instantaneous A Priori SNR Estimation by Cepstral Excitation Manipulation , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[6] Paul J. Werbos,et al. Generalization of backpropagation with application to a recurrent gas market model , 1988, Neural Networks.
[7] Rainer Martin,et al. Noise power spectral density estimation based on optimal smoothing and minimum statistics , 2001, IEEE Trans. Speech Audio Process..
[8] Jun Du,et al. Densely Connected Progressive Learning for LSTM-Based Speech Enhancement , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Tim Fingscheidt,et al. Environment-Optimized Speech Enhancement , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[11] Tim Fingscheidt,et al. Convolutional Neural Networks to Enhance Coded Speech , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[12] Hakan Erdogan,et al. Spectro-temporal post-enhancement using MMSE estimation in NMF based single-channel source separation , 2013, INTERSPEECH.
[13] Graham W. Taylor,et al. Adaptive deconvolutional networks for mid and high level feature learning , 2011, 2011 International Conference on Computer Vision.
[14] Israel Cohen,et al. Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging , 2003, IEEE Trans. Speech Audio Process..
[15] DeLiang Wang,et al. On Training Targets for Supervised Speech Separation , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] Tim Fingscheidt,et al. Concatenated Identical DNN (CI-DNN) to Reduce Noise-Type Dependence in DNN-Based Speech Enhancement , 2019, 2019 27th European Signal Processing Conference (EUSIPCO).
[17] Chin-Hui Lee,et al. Convolutional-Recurrent Neural Networks for Speech Enhancement , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Wouter Tirry,et al. Separated Noise Suppression and Speech Restoration: LSTM-Based Speech Enhancement in Two Stages , 2019, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[19] Rainer Martin,et al. A novel a priori SNR estimation approach based on selective cepstro-temporal smoothing , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[20] DeLiang Wang,et al. Complex Ratio Masking for Monaural Speech Separation , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[21] Mark D. Plumbley,et al. Two-Stage Single-Channel Audio Source Separation Using Deep Neural Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[22] Roberto Cipolla,et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[23] Philipos C. Loizou,et al. Speech Enhancement: Theory and Practice , 2007 .
[24] Jonathan G. Fiscus,et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM, NIST , 1993 .
[25] Razvan Pascanu,et al. How to Construct Deep Recurrent Neural Networks , 2013, ICLR.
[26] David Pearce,et al. The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions , 2000, INTERSPEECH.
[27] Jun Du,et al. An Experimental Study on Speech Enhancement Based on Deep Neural Networks , 2014, IEEE Signal Processing Letters.
[28] Dit-Yan Yeung,et al. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting , 2015, NIPS.
[29] Carla Teixeira Lopes,et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus , 2012 .
[30] Francesco Visin,et al. A guide to convolution arithmetic for deep learning , 2016, ArXiv.
[31] Jonathan Le Roux,et al. Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Yu-Bin Yang,et al. Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections , 2016, NIPS.
[33] Philipos C. Loizou,et al. A noise-estimation algorithm for highly non-stationary environments , 2006, Speech Commun..
[34] David Malah,et al. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator , 1984, IEEE Trans. Acoust. Speech Signal Process..
[35] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[36] DeLiang Wang,et al. A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement , 2018, INTERSPEECH.
[37] DeLiang Wang,et al. Towards Scaling Up Classification-Based Speech Separation , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[38] NTT Advanced Technology Corporation. Super wide-band stereo speech database , 2005 .
[40] Yu Tsao,et al. Complex spectrogram enhancement by convolutional neural network with multi-metrics learning , 2017, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP).
[41] Wouter Tirry,et al. DNN-Supported Speech Enhancement With Cepstral Estimation of Both Excitation and Envelope , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[42] DeLiang Wang,et al. Reconstruction techniques for improving the perceptual quality of binary masked speech , 2014, The Journal of the Acoustical Society of America.
[43] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[44] Akihiko Sugiyama,et al. A directional noise suppressor with a specified beamwidth , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[45] Seunghoon Hong,et al. Learning Deconvolution Network for Semantic Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[46] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[47] Jinwon Lee,et al. A Fully Convolutional Neural Network for Speech Enhancement , 2016, INTERSPEECH.
[48] Mohsen Rahmani,et al. Noise cross PSD estimation using phase information in diffuse noise field , 2009, Signal Process..
[49] Abdeldjalil Aïssa-El-Bey,et al. Robust Estimation of Non-Stationary Noise Power Spectrum for Speech Enhancement , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[50] Huajun Yu,et al. Post-Filter Optimization for Multichannel Automotive Speech Enhancement , 2013 .
[51] Pascal Scalart,et al. Speech enhancement based on a priori signal to noise estimation , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[52] Sridha Sridharan,et al. The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms , 2010, INTERSPEECH.
[53] Peter Vary,et al. Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model , 2005, EURASIP J. Adv. Signal Process..
[54] Björn W. Schuller,et al. Discriminatively trained recurrent neural networks for single-channel speech separation , 2014, 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP).
[55] Y. Ephraim,et al. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .
[56] DeLiang Wang,et al. Long short-term memory for speaker generalization in supervised speech separation , 2017, The Journal of the Acoustical Society of America.
[57] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[58] Jesper Jensen,et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.