Adversarially Trained End-to-end Korean Singing Voice Synthesis System
Kyogu Lee | Juheon Lee | Hyeong-Seok Choi | Junghyun Koo | Chang-Bin Jeon
[1] Aaron C. Courville, et al. Improved Training of Wasserstein GANs, 2017, NIPS.
[2] Masanori Morise, et al. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications, 2016, IEICE Trans. Inf. Syst.
[3] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[4] Yuichi Yoshida, et al. Spectral Normalization for Generative Adversarial Networks, 2018, ICLR.
[5] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[6] Yoshihiko Nankaku, et al. Singing Voice Synthesis Based on Deep Neural Networks, 2016, INTERSPEECH.
[7] Yuxuan Wang, et al. Semi-supervised Training for Improving Data Efficiency in End-to-end Speech Synthesis, 2018, ICASSP 2019.
[8] Sercan Ömer Arik, et al. Deep Voice 2: Multi-Speaker Neural Text-to-Speech, 2017, NIPS.
[9] Sangjin Kim, et al. Korean Singing Voice Synthesis System based on an LSTM Recurrent Neural Network, 2018.
[10] Hideyuki Tachibana, et al. Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention, 2017, ICASSP 2018.
[11] Jong-Jin Kim, et al. Korean Singing Voice Synthesis Based on an LSTM Recurrent Neural Network, 2018, INTERSPEECH.
[12] Jordi Bonada, et al. A Neural Parametric Singing Synthesizer Modeling Timbre and Expression from Natural Songs, 2017.
[13] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[14] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[15] Jeff Donahue, et al. Large Scale GAN Training for High Fidelity Natural Image Synthesis, 2018, ICLR.
[16] Stephane Villette, et al. Speech Bandwidth Extension Using Generative Adversarial Networks, 2018, ICASSP 2018.
[17] Samy Bengio, et al. Tacotron: Towards End-to-End Speech Synthesis, 2017, INTERSPEECH.
[18] Mark A. Clements, et al. Concatenation-Based MIDI-to-Singing Voice Synthesis, 1997.
[19] Jae Lim, et al. Signal estimation from modified short-time Fourier transform, 1984.
[20] Lars M. Mescheder, et al. On the convergence properties of GAN training, 2018, arXiv.
[21] Takeru Miyato, et al. cGANs with Projection Discriminator, 2018, ICLR.
[22] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[23] Sebastian Nowozin, et al. Which Training Methods for GANs Do Actually Converge?, 2018, ICML.
[24] Yuxuan Wang, et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis, 2018, ICML.
[25] Yuxuan Wang, et al. Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron, 2018, ICML.
[26] Hideki Kenmochi, et al. VOCALOID - commercial singing synthesizer based on sample concatenation, 2007, INTERSPEECH.