Universal Speech Enhancement with Score-based Diffusion
[1] Timo Gerkmann,et al. Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain , 2022, INTERSPEECH.
[2] Alexander Richard,et al. Conditional Diffusion Probabilistic Model for Speech Enhancement , 2022, ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] A. Finkelstein,et al. HiFi-GAN-2: Studio-Quality Speech Enhancement via Generative Adversarial Networks Conditioned on Acoustic Features , 2021, 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[4] DeLiang Wang,et al. VoiceFixer: Toward General Speech Restoration With Neural Vocoder , 2021, ArXiv.
[5] Hui Wang,et al. A Two-stage Complex Network using Cycle-consistent Generative Adversarial Networks for Speech Enhancement , 2021, Speech Commun.
[6] Eesung Kim,et al. SE-Conformer: Time-Domain Speech Enhancement Using Conformer , 2021, Interspeech.
[7] Yu Tsao,et al. A Study on Speech Enhancement Based on Diffusion Probabilistic Model , 2021, 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[8] Zhou Zhao,et al. WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution , 2021, Interspeech.
[9] Lei Xie,et al. DCCRN+: Channel-wise Subband DCCRN with SNR Estimation for Speech Enhancement , 2021, Interspeech.
[10] Gaetan Hadjeres,et al. CRASH: Raw Audio Score-based Generative Modeling for Controllable High-resolution Drum Sound Synthesis , 2021, ISMIR.
[11] Arun Asokan Nair,et al. Cascaded Time + Time-Frequency Unet For Speech Enhancement: Jointly Addressing Clipping, Codec Distortions, And Gaps , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Bernd Edler,et al. A Flow-Based Neural Network for Time Domain Speech Enhancement , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Tasnima Sadekova,et al. Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech , 2021, ICML.
[14] Visar Berisha,et al. Restoring degraded speech via a modified diffusion model , 2021, Interspeech.
[15] M. Ravanelli,et al. MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement , 2021, Interspeech.
[16] Joan Serra,et al. On tuning consistent annealed sampling for denoising score matching , 2021, ArXiv.
[17] Seungu Han,et al. NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling , 2021, Interspeech.
[18] Nam Soo Kim,et al. Diff-TTS: A Denoising Diffusion Model for Text-to-Speech , 2021, Interspeech.
[19] Andrew Hines,et al. Warp-Q: Quality Prediction for Generative Neural Speech Codecs , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Lior Wolf,et al. High Fidelity Speech Regeneration with Application to Speech Enhancement , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Xiulian Peng,et al. Interactive Speech and Noise Modeling for Speech Enhancement , 2020, AAAI.
[22] Abhishek Kumar,et al. Score-Based Generative Modeling through Stochastic Differential Equations , 2020, ICLR.
[23] Radu Horaud,et al. Fullsubnet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Joan Serra,et al. Upsampling Artifacts in Neural Audio Synthesis , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Saurabh Kataria,et al. Perceptual Loss Based Speech Denoising with an Ensemble of Audio Pattern Recognition and Self-Supervised Models , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Oleg Rybakov,et al. Real-Time Speech Frequency Bandwidth Extension , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Santiago Pascual,et al. SESQA: Semi-Supervised Learning for Speech Quality Assessment , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Bryan Catanzaro,et al. DiffWave: A Versatile Diffusion Model for Audio Synthesis , 2020, ICLR.
[29] Ioannis Mitliagkas,et al. Adversarial score matching and improved sampling for image generation , 2020, ICLR.
[30] Heiga Zen,et al. WaveGrad: Estimating Gradients for Waveform Generation , 2020, ICLR.
[31] Jean-Marc Valin,et al. PoCoNet: Better Speech Enhancement with Frequency-Positional Embeddings, Semi-Supervised Conversational Data, and Biased Loss , 2020, INTERSPEECH.
[32] Gabriel Synnaeve,et al. Real Time Speech Enhancement in the Waveform Domain , 2020, INTERSPEECH.
[33] Pieter Abbeel,et al. Denoising Diffusion Probabilistic Models , 2020, NeurIPS.
[34] Stefano Ermon,et al. Improved Techniques for Training Score-Based Generative Models , 2020, NeurIPS.
[35] Chandan K. A. Reddy,et al. The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Testing Framework, and Challenge Results , 2020, INTERSPEECH.
[36] Cong Zhou,et al. Source Coding of Audio Signals with a Generative Model , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] Maarten De Vos,et al. Improving GANs for Speech Enhancement , 2020, IEEE Signal Processing Letters.
[38] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[39] Michael I. Mandel,et al. Speaker Independence of Neural Vocoders and Their Effect on Parametric Resynthesis Speech Enhancement , 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[40] Junichi Yamagishi,et al. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit (version 0.92) , 2019 .
[41] Ryuichi Yamamoto,et al. Parallel Wavegan: A Fast Waveform Generation Model Based on Generative Adversarial Networks with Multi-Resolution Spectrogram , 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[42] Emanuël A. P. Habets,et al. Declipping Speech Using Deep Filtering , 2019, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[43] Yang Song,et al. Generative Modeling by Estimating Gradients of the Data Distribution , 2019, NeurIPS.
[44] Michael I. Mandel,et al. Parametric Resynthesis With Neural Vocoders , 2019, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[45] Adam Finkelstein,et al. Perceptually-motivated Environment-specific Speech Enhancement , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[46] Arno Solin,et al. Applied Stochastic Differential Equations , 2019 .
[47] Antonio Bonafonte,et al. Towards Generalized Speech Enhancement with Generative Adversarial Networks , 2019, INTERSPEECH.
[48] Daniel P. W. Ellis,et al. Learning Sound Event Classifiers from Web Audio with Noisy Labels , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[49] Jong Wook Kim,et al. Crepe: A Convolutional Representation for Pitch Estimation , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[50] Aaron C. Courville,et al. FiLM: Visual Reasoning with a General Conditioning Layer , 2017, AAAI.
[51] Cassia Valentini-Botinhao,et al. Noisy speech database for training speech enhancement algorithms and TTS models , 2017 .
[52] Xavier Serra,et al. A Wavenet for Speech Denoising , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[53] Antonio Bonafonte,et al. SEGAN: Speech Enhancement Generative Adversarial Network , 2017, INTERSPEECH.
[54] Surya Ganguli,et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics , 2015, ICML.
[55] Simon King,et al. Repeated Harvard Sentence Prompts corpus version 0.5 , 2014 .
[56] Pascal Vincent,et al. A Connection Between Score Matching and Denoising Autoencoders , 2011, Neural Computation.
[57] Jesper Jensen,et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[58] Aapo Hyvärinen,et al. Estimation of Non-Normalized Statistical Models by Score Matching , 2005, J. Mach. Learn. Res.
[59] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[60] B. Anderson. Reverse-time diffusion equation models , 1982 .
[61] Chengshi Zheng,et al. Two Heads are Better Than One: A Two-Stage Complex Spectral Mapping Approach for Monaural Speech Enhancement , 2021, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[62] Eric T. Nalisnick,et al. Under review as a conference paper at ICLR 2016 , 2015 .
[63] Yu Tsao,et al. Speech enhancement based on deep denoising autoencoder , 2013, INTERSPEECH.
[64] S. Srihari. Mixture Density Networks , 1994 .
[65] Jae S. Lim,et al. Speech enhancement , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[66] A. N. Kolmogorov,et al. Interpolation and extrapolation of stationary random sequences. , 1962 .
[67] Sidheswar Routray,et al. Phase sensitive masking-based single channel speech enhancement using conditional generative adversarial network , 2022, Comput. Speech Lang.