RobuTrans: A Robust Transformer-Based Text-to-Speech Model
暂无分享,去创建一个
Ming Liu | Yanqing Liu | Sheng Zhao | Naihan Li | Yu Wu | Shujie Liu | Yu Wu | Shujie Liu | Sheng Zhao | Naihan Li | Ming Liu | Yanqing Liu
[1] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[2] Jan Skoglund,et al. LPCNET: Improving Neural Speech Synthesis through Linear Prediction , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[4] Francis Charpentier,et al. Diphone synthesis using an overlap-add technique for speech waveforms concatenation , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[5] Heiga Zen,et al. Parallel WaveNet: Fast High-Fidelity Speech Synthesis , 2017, ICML.
[6] Samy Bengio,et al. Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model , 2017, ArXiv.
[7] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[8] Alex Graves,et al. Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.
[9] Heiga Zen,et al. Speech Synthesis Based on Hidden Markov Models , 2013, Proceedings of the IEEE.
[10] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Erich Elsen,et al. Efficient Neural Audio Synthesis , 2018, ICML.
[12] Ryan Prenger,et al. Waveglow: A Flow-based Generative Network for Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[14] Lei Xie,et al. Pre-Alignment Guided Attention for Improving Training Efficiency and Model Stability in End-to-End Speech Synthesis , 2019, IEEE Access.
[15] Luke S. Zettlemoyer,et al. Transformers with convolutional context for ASR , 2019, ArXiv.
[16] Yoshua Bengio,et al. SampleRNN: An Unconditional End-to-End Neural Audio Generation Model , 2016, ICLR.
[17] Li-Rong Dai,et al. Forward Attention in Sequence- To-Sequence Acoustic Modeling for Speech Synthesis , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Heiga Zen,et al. Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters , 2010, SSW.
[19] Zhizheng Wu,et al. Automatic prosody prediction and detection with Conditional Random Field (CRF) models , 2010, 2010 7th International Symposium on Chinese Spoken Language Processing.
[20] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[21] Colin Raffel,et al. Online and Linear-Time Attention by Enforcing Monotonic Alignments , 2017, ICML.
[22] Yoshua Bengio,et al. Char2Wav: End-to-End Speech Synthesis , 2017, ICLR.
[23] Alan W. Black,et al. Unit selection in a concatenative speech synthesis system using a large speech database , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[24] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[25] Werner Verhelst,et al. An overlap-add technique based on waveform similarity (WSOLA) for high quality time-scale modification of speech , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[26] Shujie Liu,et al. Neural Speech Synthesis with Transformer Network , 2018, AAAI.
[27] Xu Tan,et al. FastSpeech: Fast, Robust and Controllable Text to Speech , 2019, NeurIPS.
[28] Andrew J. Viterbi,et al. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm , 1967, IEEE Trans. Inf. Theory.
[29] Wei Ping,et al. ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech , 2018, ICLR.