暂无分享,去创建一个
Hung-yi Lee | Tsung-Han Wu | Yen-Hao Chen | Da-Yi Wu | Hung-yi Lee | Tsung-Han Wu | Yen-Hao Chen | Da-Yi Wu
[1] Junichi Yamagishi,et al. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit (version 0.92) , 2019 .
[2] Mark Hasegawa-Johnson,et al. Unsupervised Speech Decomposition via Triple Information Bottleneck , 2020, ICML.
[3] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[4] Mark Hasegawa-Johnson,et al. F0-Consistent Many-To-Many Non-Parallel Voice Conversion Via Conditional Autoencoder , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Yu Tsao,et al. Voice conversion from non-parallel corpora using variational auto-encoder , 2016, 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA).
[6] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[7] Hemant A. Patil,et al. Novel Adaptive Generative Adversarial Network for Voice Conversion , 2019, 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[8] Hirokazu Kameoka,et al. Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks , 2017, ArXiv.
[9] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[10] Kou Tanaka,et al. Cyclegan-VC2: Improved Cyclegan-based Non-parallel Voice Conversion , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Joan Serra,et al. Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion , 2019, NeurIPS.
[12] Yu Tsao,et al. Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders , 2018, 2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP).
[13] Oriol Vinyals,et al. Neural Discrete Representation Learning , 2017, NIPS.
[14] Serge J. Belongie,et al. Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[15] Kou Tanaka,et al. StarGAN-VC: non-parallel many-to-many Voice Conversion Using Star Generative Adversarial Networks , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[16] Hirokazu Kameoka,et al. CycleGAN-VC: Non-parallel Voice Conversion Using Cycle-Consistent Adversarial Networks , 2018, 2018 26th European Signal Processing Conference (EUSIPCO).
[17] Yu Tsao,et al. Voice Conversion from Unaligned Corpora Using Variational Autoencoding Wasserstein Generative Adversarial Networks , 2017, INTERSPEECH.
[18] Hung-yi Lee,et al. One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization , 2019, INTERSPEECH.
[19] Yoshua Bengio,et al. MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis , 2019, NeurIPS.
[20] Kou Tanaka,et al. ACVAE-VC: Non-Parallel Voice Conversion With Auxiliary Classifier Variational Autoencoder , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[21] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[22] Lin-Shan Lee,et al. Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations , 2018, INTERSPEECH.
[23] Kou Tanaka,et al. StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion , 2019, INTERSPEECH.
[24] Mark Hasegawa-Johnson,et al. Zero-Shot Voice Style Transfer with Only Autoencoder Loss , 2019, ICML.
[25] Hung-yi Lee,et al. One-Shot Voice Conversion by Vector Quantization , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Hung-Yi Lee,et al. VQVC+: One-Shot Voice Conversion by Vector Quantization and U-Net architecture , 2020, INTERSPEECH.
[27] Quan Wang,et al. Generalized End-to-End Loss for Speaker Verification , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).