Speaker Adaptation of a Multilingual Acoustic Model for Cross-Language Synthesis
暂无分享,去创建一个
Sandesh Aryal | Pierre Lanchantin | Ivan Himawan | Iris Chuoying Ouyang | Sam Kang | Simon King | S. Aryal | P. Lanchantin | I. Ouyang | Ivan Himawan | Simon King | Sam Kang
[1] Adam Coates,et al. Deep Voice: Real-time Neural Text-to-Speech , 2017, ICML.
[2] Patrick Nguyen,et al. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis , 2018, NeurIPS.
[3] Yoshua Bengio,et al. SampleRNN: An Unconditional End-to-End Neural Audio Generation Model , 2016, ICLR.
[4] Frank K. Soong,et al. A Cross-Language State Sharing and Mapping Approach to Bilingual (Mandarin–English) TTS , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[5] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[6] Heiga Zen,et al. Multi-Language Multi-Speaker Acoustic Modeling for LSTM-RNN Based Statistical Parametric Speech Synthesis , 2016, INTERSPEECH.
[7] Samy Bengio,et al. Tacotron: Towards End-to-End Speech Synthesis , 2017, INTERSPEECH.
[8] Yusuke Ijima,et al. DNN-Based Speech Synthesis Using Speaker Codes , 2018, IEICE Trans. Inf. Syst..
[9] Sadaoki Furui,et al. New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer , 2006, Speech Commun..
[10] Masanori Morise,et al. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications , 2016, IEICE Trans. Inf. Syst..
[11] Sercan Ömer Arik,et al. Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning , 2017, ICLR.
[12] Keiichi Tokuda,et al. Using speaker adaptive training to realize Mandarin-Tibetan cross-lingual speech synthesis , 2014, Multimedia Tools and Applications.
[13] Frank K. Soong,et al. Speaker and language factorization in DNN-based TTS synthesis , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Zhizheng Wu,et al. Merlin: An Open Source Neural Network Speech Synthesis System , 2016, SSW.
[15] Heiga Zen,et al. Statistical Parametric Speech Synthesis Based on Speaker and Language Factorization , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Lior Wolf,et al. Unsupervised Polyglot Text-to-speech , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Keiichi Tokuda,et al. Cross-Lingual Speaker Adaptation for HMM-Based Speech Synthesis , 2008, 2008 6th International Symposium on Chinese Spoken Language Processing.
[18] Sadaoki Furui,et al. New Approach to Polyglot Synthesis: How to Speak any Language with Anyone's Voice , 2006 .
[19] Ranniery Maia,et al. Speaker Adaptation in DNN-Based Speech Synthesis Using d-Vectors , 2017, INTERSPEECH.
[20] Zhengchen Zhang,et al. A light-weight method of building an LSTM-RNN-based bilingual tts system , 2017, 2017 International Conference on Asian Language Processing (IALP).
[21] Junichi Yamagishi,et al. Adapting and controlling DNN-based speech synthesis using input codes , 2017, ICASSP.
[22] Chung-Hsien Wu,et al. Polyglot Speech Synthesis Based on Cross-Lingual Frame Selection Using Auditory and Articulatory Features , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[23] Simon King,et al. Disentangling Style Factors from Speaker Representations , 2019, INTERSPEECH.
[24] Heiga Zen,et al. Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning , 2019, INTERSPEECH.
[25] Sercan Ömer Arik,et al. Neural Voice Cloning with a Few Samples , 2018, NeurIPS.
[26] Frank K. Soong,et al. Multi-speaker modeling and speaker adaptation for DNN-based TTS synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] RECOMMENDATION ITU-R BS.1534-1 - Method for the subjective assessment of intermediate quality level of coding systems , 2003 .
[28] Lior Wolf,et al. VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop , 2017, ICLR.