Rakugo speech synthesis using segment-to-segment neural transduction and style tokens — toward speech synthesis for entertaining audiences
暂无分享,去创建一个
[1] Xin Wang,et al. Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments , 2019, ArXiv.
[2] Wang Xin,et al. Use and evaluation of Tacotron and context features in rakugo speech synthesis , 2019 .
[3] Samy Bengio,et al. Tacotron: Towards End-to-End Speech Synthesis , 2017, INTERSPEECH.
[4] Shujie Liu,et al. Neural Speech Synthesis with Transformer Network , 2018, AAAI.
[5] Yoshua Bengio,et al. Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations , 2016, ICLR.
[6] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Li-Rong Dai,et al. Forward Attention in Sequence- To-Sequence Acoustic Modeling for Speech Synthesis , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Sercan Ö. Arik,et al. Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning , 2017, ICLR.
[9] Lauri Juvela,et al. A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Lei Yu,et al. Online Segment to Segment Neural Transduction , 2016, EMNLP.
[11] Xin Wang,et al. Investigation of Enhanced Tacotron Text-to-speech Synthesis Systems with Self-attention for Pitch Accent Language , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Tomoki Toda,et al. Speaker-Dependent WaveNet Vocoder , 2017, INTERSPEECH.
[13] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[14] Heiga Zen,et al. Hierarchical Generative Modeling for Controllable Speech Synthesis , 2018, ICLR.
[15] Yoshua Bengio,et al. Char2Wav: End-to-End Speech Synthesis , 2017, ICLR.
[16] Yuxuan Wang,et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis , 2018, ICML.
[17] Lior Wolf,et al. VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop , 2017, ICLR.