[1] Gautham J. Mysore, et al. VoCo, 2017, ACM Trans. Graph.
[2] Jaehyeon Kim, et al. HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis, 2020, NeurIPS.
[3] Zeyu Jin, et al. Acoustic Matching By Embedding Impulse Responses, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Patrick Nguyen, et al. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis, 2018, NeurIPS.
[5] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.
[6] Navdeep Jaitly, et al. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Zhong-Qiu Wang, et al. Deep Learning Based Target Cancellation for Speech Dereverberation, 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[8] Tie-Yan Liu, et al. DenoiSpeech: Denoising Text to Speech with Frame-Level Noise Modeling, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Tao Qin, et al. FastSpeech 2: Fast and High-Quality End-to-End Text to Speech, 2021, ICLR.
[10] Junichi Yamagishi, et al. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit (version 0.92), 2019.
[11] Heiga Zen, et al. LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech, 2019, INTERSPEECH.
[12] Yu Tsao, et al. WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-End Speech Enhancement, 2020, IEEE Signal Processing Letters.
[13] Peter Vary, et al. A binaural room impulse response database for the evaluation of dereverberation algorithms, 2009, 2009 16th International Conference on Digital Signal Processing.
[14] Tomohiro Nakatani, et al. The reverb challenge: A common evaluation framework for dereverberation and recognition of reverberant speech, 2013, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[15] Chengzhu Yu, et al. DurIAN: Duration Informed Attention Network for Speech Synthesis, 2020, INTERSPEECH.
[16] DeLiang Wang, et al. Time-Frequency Masking in the Complex Domain for Speech Dereverberation and Denoising, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[17] Yu Ting Yeung, et al. EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion, 2021, 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[18] Tan Lee, et al. CUHK-EE voice cloning system for ICASSP 2021 M2VoC challenge, 2021, ArXiv.
[19] Umut Isik, et al. Attention Wave-U-Net for Speech Enhancement, 2019, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[20] Tao Li, et al. Controllable Emotion Transfer For End-to-End Speech Synthesis, 2020, 2021 12th International Symposium on Chinese Spoken Language Processing (ISCSLP).
[21] James Glass, et al. Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Ladislav Mošner, et al. Building and Evaluation of a Real Room Impulse Response Dataset, 2018, IEEE Journal of Selected Topics in Signal Processing.
[23] Quan Wang, et al. Generalized End-to-End Loss for Speaker Verification, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Lei Xie, et al. Data Efficient Voice Cloning from Noisy Samples with Domain Adversarial Training, 2020, INTERSPEECH.