Controlling Emotion Strength with Relative Attribute for End-to-End Speech Synthesis
暂无分享,去创建一个
Geng Yang | Shan Yang | Lei Xie | Xiaolian Zhu | Lei Xie | Shan Yang | Geng Yang | Xiaolian Zhu
[1] Lianhong Cai,et al. Emphatic Speech Synthesis and Control Based on Characteristic Transferring in End-to-End Speech Synthesis , 2018, 2018 First Asian Conference on Affective Computing and Intelligent Interaction (ACII Asia).
[2] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[3] Ming Zhou,et al. Close to Human Quality TTS with Transformer , 2018, ArXiv.
[4] Björn W. Schuller,et al. The INTERSPEECH 2009 emotion challenge , 2009, INTERSPEECH.
[5] Junichi Yamagishi,et al. Investigating different representations for modeling and controlling multiple emotions in DNN-based speech synthesis , 2018, Speech Commun..
[6] Yamato Ohtani,et al. Emotional transplant in statistical speech synthesis based on emotion additive model , 2015, INTERSPEECH.
[7] Minsoo Hahn,et al. Multi-speaker Emotional Acoustic Modeling for CNN-based Speech Synthesis , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Björn W. Schuller,et al. Recent developments in openSMILE, the munich open-source multimedia feature extractor , 2013, ACM Multimedia.
[9] Kristen Grauman,et al. Relative attributes , 2011, 2011 International Conference on Computer Vision.
[10] Xin Wang,et al. Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis , 2018, ArXiv.
[11] Yuxuan Wang,et al. Predicting Expressive Speaking Style from Text in End-To-End Speech Synthesis , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[12] Samy Bengio,et al. Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model , 2017, ArXiv.
[13] Masanobu Abe,et al. An investigation to transplant emotional expressions in DNN-based TTS synthesis , 2017, 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[14] Junichi Yamagishi,et al. Adapting and controlling DNN-based speech synthesis using input codes , 2017, ICASSP.
[15] Yuxuan Wang,et al. Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron , 2018, ICML.
[16] Soo-Young Lee,et al. Emotional End-to-End Neural Speech Synthesizer , 2017, NIPS 2017.
[17] Yoshua Bengio,et al. Char2Wav: End-to-End Speech Synthesis , 2017, ICLR.
[18] Andrew Zisserman,et al. Learning Visual Attributes , 2007, NIPS.
[19] Sang Wan Lee,et al. Phonemic-level Duration Control Using Attention Alignment for Natural Speech Synthesis , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Lirong Dai,et al. Emotional statistical parametric speech synthesis using LSTM-RNNs , 2017, 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[21] Adam Coates,et al. Deep Voice: Real-time Neural Text-to-Speech , 2017, ICML.
[22] Soroosh Mariooryad,et al. Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis , 2019, ArXiv.
[23] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Ying Chen,et al. Implementing Prosodic Phrasing in Chinese End-to-end Speech Synthesis , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Olivier Chapelle,et al. Training a Support Vector Machine in the Primal , 2007, Neural Computation.
[26] Shrikanth S. Narayanan,et al. Expressive speech synthesis using a concatenative synthesizer , 2002, INTERSPEECH.
[27] Zhen-Hua Ling,et al. Learning Latent Representations for Style Control and Transfer in End-to-end Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Taesu Kim,et al. Robust and Fine-grained Prosody Control of End-to-end Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Yuxuan Wang,et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis , 2018, ICML.
[30] Vincent Wan,et al. CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network , 2019, ICML.
[31] Zhong-Qiu Wang,et al. Learning utterance-level representations for speech emotion and age/gender recognition using deep neural networks , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).