Improving transfer of expressivity for end-to-end multispeaker text-to-speech synthesis
[1] Hasan Şakir Bilge,et al. Deep Metric Learning: A Survey , 2019, Symmetry.
[2] Damien Lolive,et al. SynPaFlex-Corpus: An Expressive French Audiobooks Corpus dedicated to expressive speech synthesis , 2018, LREC.
[3] Denis Jouvet,et al. Transfer Learning of the Expressivity Using FLOW Metric Learning in Multispeaker Text-to-Speech Synthesis , 2020, INTERSPEECH.
[4] Zhen-Hua Ling,et al. Learning Latent Representations for Style Control and Transfer in End-to-end Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Yutaka Matsuo,et al. Expressive Speech Synthesis via Modeling Expressions with Variational Autoencoder , 2018, INTERSPEECH.
[6] Yuxuan Wang,et al. Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron , 2018, ICML.
[7] Kihyuk Sohn,et al. Improved Deep Metric Learning with Multi-class N-pair Loss Objective , 2016, NIPS.
[8] Stefan Winkler,et al. Mean opinion score (MOS) revisited: methods and applications, limitations and alternatives , 2016, Multimedia Systems.
[9] Denis Jouvet,et al. Deep Variational Metric Learning for Transfer of Expressivity in Multispeaker Text to Speech , 2020, SLSP.
[10] Marcela Charfuelan,et al. Expressive speech synthesis in MARY TTS using audiobook data and emotionML , 2013, INTERSPEECH.
[11] Taesu Kim,et al. Robust and Fine-grained Prosody Control of End-to-end Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Junichi Yamagishi,et al. The SIWIS French Speech Synthesis Database , 2017 .
[13] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Samy Bengio,et al. Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model , 2017, ArXiv.
[15] Xudong Lin,et al. Deep Variational Metric Learning , 2018, ECCV.
[16] Thierry Dutoit,et al. Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis , 2019, INTERSPEECH.
[17] Marius Cotescu,et al. Using VAEs and Normalizing Flows for One-Shot Text-To-Speech Synthesis of Expressive Speech , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Slim Ouni,et al. Conditional Variational Auto-Encoder for Text-Driven Expressive AudioVisual Speech Synthesis , 2019, INTERSPEECH.
[19] R. Sarpong,et al. Bio-inspired synthesis of xishacorenes A, B, and C, and a new congener from fuscol , 2019, Chemical Science.
[20] Ryan Prenger,et al. WaveGlow: A Flow-based Generative Network for Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Yuxuan Wang,et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis , 2018, ICML.
[22] Heiga Zen,et al. Hierarchical Generative Modeling for Controllable Speech Synthesis , 2018, ICLR.
[23] Samy Bengio,et al. Generating Sentences from a Continuous Space , 2015, CoNLL.