End-To-End Accent Conversion Without Using Native Utterances
Songxiang Liu | Yuewen Cao | Dan Su | Helen Meng | Zhiyong Wu | Xixin Wu | Lifa Sun | Shiyin Kang | Dong Yu | Xunying Liu | Disong Wang
[1] Shinji Watanabe,et al. Joint CTC-attention based end-to-end speech recognition using multi-task learning , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Ricardo Gutierrez-Osuna,et al. Developing Objective Measures of Foreign-Accent Conversion , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[3] Ricardo Gutierrez-Osuna,et al. Foreign accent conversion through voice morphing , 2013, INTERSPEECH.
[4] Ricardo Gutierrez-Osuna,et al. Accent Conversion Using Phonetic Posteriorgrams , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Junichi Yamagishi,et al. SUPERSEDED - CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit , 2016 .
[6] Jonathan G. Fiscus,et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM , 1993, NIST.
[7] Hao Wang,et al. Phonetic posteriorgrams for many-to-one voice conversion without parallel data training , 2016, 2016 IEEE International Conference on Multimedia and Expo (ICME).
[8] Ricardo Gutierrez-Osuna,et al. Foreign accent conversion in computer assisted pronunciation training , 2009, Speech Commun..
[9] Joon Son Chung,et al. VoxCeleb2: Deep Speaker Recognition , 2018, INTERSPEECH.
[10] Erich Elsen,et al. Efficient Neural Audio Synthesis , 2018, ICML.
[11] Quan Wang,et al. Generalized End-to-End Loss for Speaker Verification , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Li-Rong Dai,et al. Non-Parallel Sequence-to-Sequence Voice Conversion With Disentangled Linguistic and Speaker Representations , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[13] Carla Teixeira Lopes,et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus , 2012 .
[14] Milos Cernak,et al. End-to-End Accented Speech Recognition , 2019, INTERSPEECH.
[15] Quoc V. Le,et al. Listen, Attend and Spell , 2015, ArXiv.
[16] Mark Huckvale,et al. Spoken language conversion with accent morphing , 2007, SSW.
[17] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[18] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[19] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Ricardo Gutierrez-Osuna,et al. Can voice conversion be used to reduce non-native accents? , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Xunying Liu,et al. The HCCL-CUHK System for the Voice Conversion Challenge 2018 , 2018, Odyssey.
[22] Junichi Yamagishi,et al. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit , 2017 .
[23] Xunying Liu,et al. Jointly Trained Conversion Model and WaveNet Vocoder for Non-Parallel Voice Conversion Using Mel-Spectrograms and Phonetic Posteriorgrams , 2019, INTERSPEECH.
[24] Patrick Nguyen,et al. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis , 2018, NeurIPS.
[25] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Xunying Liu,et al. Voice Conversion Across Arbitrary Speakers Based on a Single Target-Speaker Utterance , 2018, INTERSPEECH.
[27] Ricardo Gutierrez-Osuna,et al. Using Phonetic Posteriorgram Based Frame Pairing for Segmental Accent Conversion , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Joon Son Chung,et al. VoxCeleb: A Large-Scale Speaker Identification Dataset , 2017, INTERSPEECH.