Non-Autoregressive Neural Text-to-Speech
[1] Xu Tan, et al. FastSpeech: Fast, Robust and Controllable Text to Speech, 2019, NeurIPS.
[2] Jason Lee, et al. Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement, 2018, EMNLP.
[3] Patrick Nguyen, et al. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis, 2018, NeurIPS.
[4] Rob Fergus, et al. Stochastic Video Generation with a Learned Prior, 2018, ICML.
[5] Wei Ping, et al. WaveFlow: A Compact Flow-based Model for Raw Audio, 2019, ICML.
[6] Oriol Vinyals, et al. Neural Discrete Representation Learning, 2017, NIPS.
[7] Heiga Zen, et al. Hierarchical Generative Modeling for Controllable Speech Synthesis, 2018, ICLR.
[8] Cha Zhang, et al. CROWDMOS: An approach for crowdsourcing mean opinion score studies, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Yoshua Bengio, et al. A Recurrent Latent Variable Model for Sequential Data, 2015, NIPS.
[10] Adam Coates, et al. Deep Voice: Real-time Neural Text-to-Speech, 2017, ICML.
[11] Navdeep Jaitly, et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Heiga Zen, et al. Sample Efficient Adaptive Text-to-Speech, 2018, ICLR.
[13] Wei Ping, et al. ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech, 2018, ICLR.
[14] Samy Bengio, et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks, 2015, NIPS.
[15] Yoshua Bengio, et al. NICE: Non-linear Independent Components Estimation, 2014, ICLR.
[16] Max Welling, et al. Improved Variational Inference with Inverse Autoregressive Flow, 2016, NIPS 2016.
[17] Victor O. K. Li, et al. Non-Autoregressive Neural Machine Translation, 2017, ICLR.
[18] Shakir Mohamed, et al. Variational Inference with Normalizing Flows, 2015, ICML.
[19] Lior Wolf, et al. VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop, 2017, ICLR.
[20] Heiga Zen, et al. Parallel WaveNet: Fast High-Fidelity Speech Synthesis, 2017, ICML.
[21] Samy Bengio, et al. Tacotron: Towards End-to-End Speech Synthesis, 2017, INTERSPEECH.
[22] Alexandre Lacoste, et al. Probability Distillation: A Caveat and Alternatives, 2019, UAI.
[23] Shujie Liu, et al. Neural Speech Synthesis with Transformer Network, 2018, AAAI.
[24] Lior Wolf, et al. Fitting New Speakers Based on a Short Untranscribed Sample, 2018, ICML.
[25] Sercan Ömer Arik, et al. Neural Voice Cloning with a Few Samples, 2018, NeurIPS.
[26] Ryan Prenger, et al. Waveglow: A Flow-based Generative Network for Speech Synthesis, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Yoshua Bengio, et al. Char2Wav: End-to-End Speech Synthesis, 2017, ICLR.
[28] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[29] Sungwon Kim, et al. FloWaveNet: A Generative Flow for Raw Audio, 2018, ICML.
[30] Sercan Ömer Arik, et al. Deep Voice 2: Multi-Speaker Neural Text-to-Speech, 2017, NIPS.
[31] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[32] Samy Bengio, et al. Generating Sentences from a Continuous Space, 2015, CoNLL.
[33] Eunwoo Song, et al. Probability density distillation with generative adversarial networks for high-quality parallel waveform generation, 2019, INTERSPEECH.
[34] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[35] Prafulla Dhariwal, et al. Glow: Generative Flow with Invertible 1x1 Convolutions, 2018, NeurIPS.
[36] Samy Bengio, et al. Density estimation using Real NVP, 2016, ICLR.
[37] Erich Elsen, et al. Efficient Neural Audio Synthesis, 2018, ICML.
[38] Xin Wang, et al. Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Yoshua Bengio, et al. Attention-Based Models for Speech Recognition, 2015, NIPS.
[40] Sercan Ömer Arik, et al. Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning, 2017, ICLR.
[41] Gregory Diamos, et al. Fast Spectrogram Inversion Using Multi-Head Convolutional Neural Networks, 2018, IEEE Signal Processing Letters.
[42] Joseph P. Olive, et al. Text-to-speech synthesis, 1995, AT&T Technical Journal.
[43] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[44] Yoshua Bengio, et al. MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis, 2019, NeurIPS.
[45] Erich Elsen, et al. High Fidelity Speech Synthesis with Adversarial Networks, 2019, ICLR.
[46] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[47] Aurko Roy, et al. Fast Decoding in Sequence Models using Discrete Latent Variables, 2018, ICML.
[48] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.