MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Yoshua Bengio | Aaron C. Courville | Kundan Kumar | Rithesh Kumar | Thibault de Boissiere | Lucas Gestin | Wei Zhen Teoh | Jose Sotelo | Alexandre de Brebisson
[1] Jae S. Lim,et al. Signal estimation from modified short-time Fourier transform , 1983, ICASSP.
[2] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[3] Yann LeCun,et al. Deep multi-scale video prediction beyond mean square error , 2015, ICLR.
[4] Tim Salimans,et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks , 2016, NIPS.
[5] Andrea Vedaldi,et al. Instance Normalization: The Missing Ingredient for Fast Stylization , 2016, ArXiv.
[6] Leon A. Gatys,et al. Image Style Transfer Using Convolutional Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Thomas Brox,et al. Generating Images with Perceptual Similarity Metrics based on Deep Networks , 2016, NIPS.
[8] Li Fei-Fei,et al. Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.
[9] Masanori Morise,et al. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications , 2016, IEICE Trans. Inf. Syst.
[10] Junichi Yamagishi,et al. SUPERSEDED - CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit , 2016 .
[11] Vincent Dumoulin,et al. Deconvolution and Checkerboard Artifacts , 2016 .
[12] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[13] Ole Winther,et al. Autoencoding beyond pixels using a learned similarity metric , 2015, ICML.
[14] Sercan Ömer Arik,et al. Deep Voice 2: Multi-Speaker Neural Text-to-Speech , 2017, NIPS.
[15] Jae Hyun Lim,et al. Geometric GAN , 2017, ArXiv.
[16] Adam Coates,et al. Deep Voice: Real-time Neural Text-to-Speech , 2017, ICML.
[17] Raymond Y. K. Lau,et al. Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[18] Alexei A. Efros,et al. Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Yoshua Bengio,et al. Char2Wav: End-to-End Speech Synthesis , 2017, ICLR.
[20] Karen Simonyan,et al. Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders , 2017, ICML.
[21] Samy Bengio,et al. Tacotron: Towards End-to-End Speech Synthesis , 2017, INTERSPEECH.
[22] Alexei A. Efros,et al. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[23] Yoshua Bengio,et al. SampleRNN: An Unconditional End-to-End Neural Audio Generation Model , 2016, ICLR.
[24] 拓海 杉山,et al. Study report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks” , 2017 .
[25] Aaron C. Courville,et al. Improved Training of Wasserstein GANs , 2017, NIPS.
[26] Oriol Vinyals,et al. Neural Discrete Representation Learning , 2017, NIPS.
[27] Marco Maggioni,et al. Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking , 2018, ArXiv.
[28] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Prafulla Dhariwal,et al. Glow: Generative Flow with Invertible 1x1 Convolutions , 2018, NeurIPS.
[30] Zaïd Harchaoui,et al. Invariances and Data Augmentation for Supervised Music Transcription , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Karen Simonyan,et al. The challenge of realistic music generation: modelling raw audio at scale , 2018, NeurIPS.
[32] Chris Donahue,et al. Synthesizing Audio with Generative Adversarial Networks , 2018, ArXiv.
[33] Jaakko Lehtinen,et al. Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.
[34] Yuichi Yoshida,et al. Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.
[35] Jan Kautz,et al. Video-to-Video Synthesis , 2018, NeurIPS.
[36] Harshad Rai,et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks , 2018 .
[37] Sercan Ömer Arik,et al. Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning , 2017, ICLR.
[38] Jan Kautz,et al. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[39] Erich Elsen,et al. Efficient Neural Audio Synthesis , 2018, ICML.
[40] Heiga Zen,et al. Parallel WaveNet: Fast High-Fidelity Speech Synthesis , 2017, ICML.
[41] Chris Donahue,et al. Adversarial Audio Synthesis , 2018, ICLR.
[42] Ryan Prenger,et al. Waveglow: A Flow-based Generative Network for Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[43] Eunwoo Song,et al. Probability density distillation with generative adversarial networks for high-quality parallel waveform generation , 2019, INTERSPEECH.
[44] Shlomo Dubnov,et al. Expediting TTS Synthesis with Adversarial Vocoding , 2019, INTERSPEECH.
[45] Wei Ping,et al. ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech , 2018, ICLR.
[46] Kumar Krishna Agrawal,et al. GANSynth: Adversarial Neural Audio Synthesis , 2019, ICLR.
[47] Taesung Park,et al. Semantic Image Synthesis With Spatially-Adaptive Normalization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Han Zhang,et al. Self-Attention Generative Adversarial Networks , 2018, ICML.
[49] Timo Aila,et al. A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[50] Alexei A. Efros,et al. Everybody Dance Now , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).