ARVC: An Auto-Regressive Voice Conversion System Without Parallel Training Data
[1] E. Owens,et al. An Introduction to the Psychology of Hearing, 1997.
[2] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[3] Junichi Yamagishi,et al. The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods , 2018, Odyssey.
[4] Tetsuya Takiguchi,et al. Exemplar-Based Voice Conversion Using Sparse Representation in Noisy Environments , 2013, IEICE Trans. Fundam. Electron. Commun. Comput. Sci.
[5] Kou Tanaka,et al. StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion , 2019, INTERSPEECH.
[6] Chengzhu Yu,et al. PitchNet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Lukasz Kaiser,et al. Attention Is All You Need , 2017, NIPS.
[8] Li-Rong Dai,et al. WaveNet Vocoder with Limited Training Data for Voice Conversion , 2018, INTERSPEECH.
[9] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[10] Eric Moulines,et al. Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process.
[11] Kou Tanaka,et al. StarGAN-VC: Non-parallel Many-to-Many Voice Conversion Using Star Generative Adversarial Networks , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[12] Frank K. Soong,et al. Voice conversion with SI-DNN and KL divergence based mapping without parallel training data , 2019, Speech Commun.
[13] Yu Tsao,et al. Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders , 2018, 2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP).
[14] Masanori Morise,et al. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications , 2016, IEICE Trans. Inf. Syst..
[15] Marc Schröder,et al. Evaluation of Expressive Speech Synthesis With Voice Conversion and Copy Resynthesis Techniques , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Kou Tanaka,et al. CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Erich Elsen,et al. Efficient Neural Audio Synthesis , 2018, ICML.
[18] Hao Wang,et al. Phonetic posteriorgrams for many-to-one voice conversion without parallel data training , 2016, 2016 IEEE International Conference on Multimedia and Expo (ICME).
[19] Samy Bengio,et al. Tacotron: Towards End-to-End Speech Synthesis , 2017, INTERSPEECH.
[20] Samy Bengio,et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.
[21] Hirokazu Kameoka,et al. CycleGAN-VC: Non-parallel Voice Conversion Using Cycle-Consistent Adversarial Networks , 2018, 2018 26th European Signal Processing Conference (EUSIPCO).
[22] John-Paul Hosom,et al. Improving the intelligibility of dysarthric speech , 2007, Speech Commun.
[23] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun.
[24] Haizhou Li,et al. Cross-lingual Voice Conversion with Bilingual Phonetic Posteriorgram and Average Modeling , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Yoshua Bengio,et al. Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations , 2016, ICLR.
[26] Yu Tsao,et al. Voice conversion from non-parallel corpora using variational auto-encoder , 2016, 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA).
[27] Xiao Chen,et al. Voice Conversion with Transformer Network , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Fadi Biadsy,et al. Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation , 2019, INTERSPEECH.
[29] Dorien Herremans,et al. Singing Voice Conversion with Disentangled Representations of Singer and Vocal Technique Using Variational Autoencoders , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[30] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit, 2011.
[31] Chng Eng Siong,et al. A Speaker-Dependent WaveNet for Voice Conversion with Non-Parallel Data , 2019, INTERSPEECH.
[32] Kishore Prahallad,et al. Spectral Mapping Using Artificial Neural Networks for Voice Conversion , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[33] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[34] Jan Skoglund,et al. LPCNet: Improving Neural Speech Synthesis through Linear Prediction , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).