Fast Spectrogram Inversion Using Multi-Head Convolutional Neural Networks