暂无分享,去创建一个
Tuomo Raitio | Ramya Rasipuram | Ladan Golipour | David Winarsky | Sivanand Achanta | Hepeng Zhang | Jiangchuan Li | Albert Antony | Francesco Rossi | Jennifer Shi | Jaimin Upadhyay | Sivanand Achanta | T. Raitio | L. Golipour | Ramya Rasipuram | Jiangchuan Li | Albert Antony | Francesco Rossi | Jennifer Shi | Jaimin Upadhyay | David Winarsky | Hepeng Zhang
[1] Erich Elsen,et al. End-to-End Adversarial Text-to-Speech , 2020, ArXiv.
[2] Thomas Drugman,et al. Towards Achieving Robust Universal Neural Vocoding , 2018, INTERSPEECH.
[3] Tao Qin,et al. FastSpeech 2: Fast and High-Quality End-to-End Text to Speech , 2021, ICLR.
[4] Yoshua Bengio,et al. SampleRNN: An Unconditional End-to-End Neural Audio Generation Model , 2016, ICLR.
[5] Jan Skoglund,et al. LPCNET: Improving Neural Speech Synthesis through Linear Prediction , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Samy Bengio,et al. Tacotron: Towards End-to-End Speech Synthesis , 2017, INTERSPEECH.
[7] Heiga Zen,et al. Parallel WaveNet: Fast High-Fidelity Speech Synthesis , 2017, ICML.
[8] Yoshua Bengio,et al. MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis , 2019, NeurIPS.
[9] Erich Elsen,et al. Efficient Neural Audio Synthesis , 2018, ICML.
[10] Sercan Ömer Arik,et al. Deep Voice 3: 2000-Speaker Neural Text-to-Speech , 2017, ICLR 2018.
[11] Ben Poole,et al. Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.
[12] Yee Whye Teh,et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables , 2016, ICLR.
[13] Wei Ping,et al. ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech , 2018, ICLR.
[14] Yoshua Bengio,et al. Char2Wav: End-to-End Speech Synthesis , 2017, ICLR.
[15] Soroosh Mariooryad,et al. Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis , 2020, ArXiv.
[16] Adam Coates,et al. Deep Voice: Real-time Neural Text-to-Speech , 2017, ICML.
[17] Ryan Prenger,et al. Waveglow: A Flow-based Generative Network for Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Sercan Ömer Arik,et al. Deep Voice 2: Multi-Speaker Neural Text-to-Speech , 2017, NIPS.
[19] Heiga Zen,et al. Parallel Tacotron: Non-Autoregressive and Controllable TTS , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Colin Raffel,et al. Online and Linear-Time Attention by Enforcing Monotonic Alignments , 2017, ICML.
[21] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[22] Lei He,et al. Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS , 2019, INTERSPEECH.
[23] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Xu Tan,et al. FastSpeech: Fast, Robust and Controllable Text to Speech , 2019, NeurIPS.
[25] Zhizheng Wu,et al. Siri On-Device Deep Learning-Guided Unit Selection Text-to-Speech System , 2017, INTERSPEECH.