Time Domain Adversarial Voice Conversion for ADD 2022
Xiangang Li | Wei Zou | Tingwei Guo | Shuran Zhou | Chuandong Xie | Rui Yan | Cheng Wen | Xi Tan
[1] Konstantin Böttinger, et al. Human Perception of Audio Deepfakes, 2022, DDAM@MM.
[2] Haizhou Li, et al. ADD 2022: the first Audio Deep Synthesis Detection Challenge, 2022, ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Vandana P. Janeja, et al. How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey, 2021, ArXiv.
[4] Zhen-Hua Ling, et al. Adversarial Voice Conversion Against Neural Spoofing Detectors, 2021, Interspeech.
[5] Nima Mesgarani, et al. StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion, 2021, Interspeech.
[6] Manh Luong, et al. Many-to-Many Voice Conversion based Feature Disentanglement using Variational Autoencoder, 2021, Interspeech.
[7] Tao Qin, et al. A Survey on Neural Speech Synthesis, 2021, ArXiv.
[8] Tao Qin, et al. AdaSpeech: Adaptive Text to Speech for Custom Voice, 2021, ICLR.
[9] Bin Ma, et al. Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram, 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Hui Bu, et al. AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines, 2020, ArXiv.
[11] Kun Han, et al. Didispeech: A Large Scale Mandarin Speech Corpus, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021.
[13] Jaehyeon Kim, et al. HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis, 2020, NeurIPS.
[14] Yoshua Bengio, et al. MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis, 2019, NeurIPS.
[15] Tomoki Toda, et al. Non-Parallel Voice Conversion with Cyclic Variational Autoencoder, 2019, INTERSPEECH.
[16] Xu Tan, et al. FastSpeech: Fast, Robust and Controllable Text to Speech, 2019, NeurIPS.
[17] Kou Tanaka, et al. StarGAN-VC: non-parallel many-to-many Voice Conversion Using Star Generative Adversarial Networks, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[18] Junichi Yamagishi, et al. High-Quality Nonparallel Voice Conversion Based on Cycle-Consistent Adversarial Network, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Sercan Ömer Arik, et al. Neural Voice Cloning with a Few Samples, 2018, NeurIPS.
[20] Navdeep Jaitly, et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Sercan Ömer Arik, et al. Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning, 2017, ICLR.
[22] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).