One-Shot Voice Conversion by Vector Quantization
暂无分享,去创建一个
[1] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[2] Moncef Gabbouj,et al. Voice Conversion Using Partial Least Squares Regression , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[3] Yu Tsao,et al. Voice conversion from non-parallel corpora using variational auto-encoder , 2016, 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA).
[4] Yu Tsao,et al. Voice Conversion from Unaligned Corpora Using Variational Autoencoding Wasserstein Generative Adversarial Networks , 2017, INTERSPEECH.
[5] Oriol Vinyals,et al. Neural Discrete Representation Learning , 2017, NIPS.
[6] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[7] Eric Moulines,et al. Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..
[8] Hung-yi Lee,et al. One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization , 2019, INTERSPEECH.
[9] Kou Tanaka,et al. StarGAN-VC: non-parallel many-to-many Voice Conversion Using Star Generative Adversarial Networks , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[10] Mark Hasegawa-Johnson,et al. Zero-Shot Voice Style Transfer with Only Autoencoder Loss , 2019, ICML.
[11] Ricardo Gutierrez-Osuna,et al. Group Latent Embedding for Vector Quantized Variational Autoencoder in Non-Parallel Voice Conversion , 2019, INTERSPEECH.
[12] Ron J. Weiss,et al. Unsupervised Speech Representation Learning Using WaveNet Autoencoders , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[13] Soumith Chintala,et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.
[14] Lin-Shan Lee,et al. Rhythm-Flexible Voice Conversion Without Parallel Data Using Cycle-GAN Over Phoneme Posteriorgram Sequences , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[15] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Serge J. Belongie,et al. Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[17] Ali Razavi,et al. Generating Diverse High-Fidelity Images with VQ-VAE-2 , 2019, NeurIPS.