Real Time Speech Enhancement in the Waveform Domain
Gabriel Synnaeve | Yossi Adi | Alexandre Défossez
[1] Antonio Bonafonte,et al. SEGAN: Speech Enhancement Generative Adversarial Network , 2017, INTERSPEECH.
[2] Yi Hu,et al. Subjective Comparison of Speech Enhancement Algorithms , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[3] Ke Wang,et al. Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition , 2018, INTERSPEECH.
[4] Nima Mesgarani,et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[5] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[6] Sebastian Braun,et al. Weighted Speech Distortion Losses for Neural-Network-Based Real-Time Speech Enhancement , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Kuldip K. Paliwal,et al. Deep Residual-Dense Lattice Network for Speech Enhancement , 2020, AAAI.
[8] John H. L. Hansen,et al. Babble Noise: Modeling, Analysis, and Applications , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[9] DeLiang Wang,et al. A deep neural network for time-domain signal reconstruction , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Kuldip K. Paliwal,et al. Deep learning for minimum mean-square error approaches to speech enhancement , 2019, Speech Commun.
[11] Edouard Grave,et al. End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures , 2019, ArXiv.
[12] Xavier Serra,et al. A Wavenet for Speech Denoising , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Deepak Baby,et al. SERGAN: Speech Enhancement Using Relativistic Generative Adversarial Networks with Gradient Penalty , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[15] Johannes Gehrke,et al. The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Speech Quality and Testing Framework , 2020, ArXiv.
[16] Yi Hu,et al. Evaluation of Objective Quality Measures for Speech Enhancement , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Vladlen Koltun,et al. Speech Denoising with Deep Feature Losses , 2018, INTERSPEECH.
[18] Jun Du,et al. Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement , 2017, INTERSPEECH.
[19] Reinhold Häb-Umbach,et al. An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[20] Eunwoo Song,et al. Probability density distillation with generative adversarial networks for high-quality parallel waveform generation , 2019, INTERSPEECH.
[21] Björn W. Schuller,et al. Discriminatively trained recurrent neural networks for single-channel speech separation , 2014, 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP).
[22] Y. Ephraim,et al. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984.
[23] Johannes Gehrke,et al. A scalable noisy speech dataset and online subjective test framework , 2019, INTERSPEECH.
[24] Issa M. S. Panahi,et al. An Individualized Super-Gaussian Single Microphone Speech Enhancement for Hearing Aid Users With Smartphone as an Assistive Device , 2017, IEEE Signal Processing Letters.
[25] Jesper Jensen,et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[26] Ryuichi Yamamoto,et al. Parallel WaveGAN: A Fast Waveform Generation Model Based on Generative Adversarial Networks with Multi-Resolution Spectrogram , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] DeLiang Wang,et al. A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement , 2018, INTERSPEECH.
[28] A.V. Oppenheim,et al. Enhancement and bandwidth compression of noisy speech , 1979, Proceedings of the IEEE.
[30] Yossi Adi,et al. Voice Separation with an Unknown Number of Multiple Speakers , 2020, ICML.
[31] Cassia Valentini-Botinhao,et al. Noisy speech database for training speech enhancement algorithms and TTS models , 2017.
[32] Yann Dauphin,et al. Language Modeling with Gated Convolutional Networks , 2016, ICML.
[33] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[34] Francis Bach,et al. Music Source Separation in the Waveform Domain , 2019, ArXiv.
[35] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[36] Maarten De Vos,et al. Improving GANs for Speech Enhancement , 2020, IEEE Signal Processing Letters.
[37] Cha Zhang,et al. CROWDMOS: An approach for crowdsourcing mean opinion score studies , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[38] Shou-De Lin,et al. MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement , 2019, ICML.
[39] Tillman Weyde,et al. Improved Speech Enhancement with the Wave-U-Net , 2018, ArXiv.
[40] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[41] Hemant A. Patil,et al. Time-Frequency Masking-Based Speech Enhancement Using Generative Adversarial Network , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[42] Björn W. Schuller,et al. Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR , 2015, LVA/ICA.
[43] Kuldip K. Paliwal,et al. DeepMMSE: A Deep Learning Approach to MMSE-Based Noise Power Spectral Density Estimation , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[44] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[45] Julius O. Smith,et al. A flexible sampling-rate conversion method , 1984, ICASSP.