Tomoki Toda | Shinji Watanabe | Hung-Yi Lee | Tomoki Hayashi | Wen-Chin Huang | Shu-Wen Yang
[1] Li-Rong Dai, et al. WaveNet Vocoder with Limited Training Data for Voice Conversion, 2018, INTERSPEECH.
[2] Navdeep Jaitly, et al. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions, 2018, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Hao Wang, et al. Phonetic posteriorgrams for many-to-one voice conversion without parallel data training, 2016, IEEE International Conference on Multimedia and Expo (ICME).
[5] Junichi Yamagishi, et al. Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion, 2020, Blizzard Challenge / Voice Conversion Challenge.
[6] Eric Moulines, et al. Continuous probabilistic transform for voice conversion, 1998, IEEE Transactions on Speech and Audio Processing.
[7] Patrick Nguyen, et al. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis, 2018, NeurIPS.
[8] Li-Rong Dai, et al. Non-Parallel Voice Conversion with Autoregressive Conversion Model and Duration Adjustment, 2020, Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020.
[9] Tomoki Toda, et al. Any-to-One Sequence-to-Sequence Voice Conversion Using Self-Supervised Discrete Speech Representations, 2020, 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Eugene Kharitonov, et al. Speech Resynthesis from Discrete Disentangled Self-Supervised Representations, 2021, INTERSPEECH.
[11] Subjective evaluation of speech quality with a crowdsourcing approach, 2022.
[12] Tomoki Toda, et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[13] Erik McDermott, et al. Deep neural networks for small footprint text-dependent speaker verification, 2014, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Junichi Yamagishi, et al. Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions, 2020, Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020.
[15] Samy Bengio, et al. Tacotron: Towards End-to-End Speech Synthesis, 2017, INTERSPEECH.
[16] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[17] Alexei Baevski, et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations, 2020, NeurIPS.
[18] Babak Naderi, et al. An Open Source Implementation of ITU-T Recommendation P.808 with Validation, 2020, INTERSPEECH.
[19] Junichi Yamagishi, et al. An autoregressive recurrent mixture density network for parametric speech synthesis, 2017, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Shinji Watanabe, et al. SUPERB: Speech processing Universal PERformance Benchmark, 2021, INTERSPEECH.
[21] Tomoki Toda, et al. On Prosody Modeling for ASR+TTS Based Voice Conversion, 2021, IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[22] Hung-yi Lee, et al. FragmentVC: Any-to-Any Voice Conversion by End-to-End Extracting and Fusing Fine-Grained Voice Fragments with Attention, 2020, 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] J. Tao, et al. CASIA Voice Conversion System for the Voice Conversion Challenge 2020, 2020, Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020.
[24] Meng Li, et al. Exploring wav2vec 2.0 on speaker verification and language identification, 2020, INTERSPEECH.
[25] Hung-yi Lee, et al. S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations, 2021, INTERSPEECH.
[26] Jaehyeon Kim, et al. HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis, 2020, NeurIPS.
[27] Yuan Jiang, et al. Voice Conversion by Cascading Automatic Speech Recognition and Text-to-Speech Synthesis with Prosody Transfer, 2020, Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020.
[28] Tomoki Toda, et al. The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS, 2020, Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020.
[29] Junichi Yamagishi, et al. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit, 2017.