Jointly Trained Conversion Model and WaveNet Vocoder for Non-Parallel Voice Conversion Using Mel-Spectrograms and Phonetic Posteriorgrams
Xunying Liu | Songxiang Liu | Yuewen Cao | Xixin Wu | Lifa Sun | Helen Meng
[1] Masanori Morise, et al. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications, 2016, IEICE Trans. Inf. Syst.
[2] Hideki Kawahara, et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, 1999, Speech Commun.
[3] Junichi Yamagishi, et al. The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods, 2018, Odyssey.
[4] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[5] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[6] Kishore Prahallad, et al. Spectral Mapping Using Artificial Neural Networks for Voice Conversion, 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[7] Tetsuya Takiguchi, et al. Voice Conversion Using RNN Pre-Trained by Recurrent Temporal Restricted Boltzmann Machines, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[8] Yonghong Yan, et al. High Quality Voice Conversion through Phoneme-Based Linear Mapping Functions with STRAIGHT for Mandarin, 2007, Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007).
[9] Navdeep Jaitly, et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Wei Ping, et al. ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech, 2018, ICLR.
[11] Xunying Liu, et al. Voice Conversion Across Arbitrary Speakers Based on a Single Target-Speaker Utterance, 2018, INTERSPEECH.
[12] Tomoki Toda, et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[13] Kun Li, et al. Voice conversion using deep Bidirectional Long Short-Term Memory based Recurrent Neural Networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Hao Wang, et al. Phonetic posteriorgrams for many-to-one voice conversion without parallel data training, 2016, 2016 IEEE International Conference on Multimedia and Expo (ICME).
[15] Hui Lu, et al. A Compact Framework for Voice Conversion Using Wavenet Conditioned on Phonetic Posteriorgrams, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Xunying Liu, et al. The HCCL-CUHK System for the Voice Conversion Challenge 2018, 2018, Odyssey.
[17] Li-Rong Dai, et al. WaveNet Vocoder with Limited Training Data for Voice Conversion, 2018, INTERSPEECH.
[18] Eric Moulines, et al. Continuous probabilistic transform for voice conversion, 1998, IEEE Trans. Speech Audio Process.
[19] Bo Chen, et al. High-quality Voice Conversion Using Spectrogram-Based WaveNet Vocoder, 2018, INTERSPEECH.
[20] Carla Teixeira Lopes, et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus, 2012.
[21] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[22] Seyed Hamidreza Mohammadi, et al. Voice conversion using deep neural networks with speaker-independent pre-training, 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).