FFTNet: A Real-Time Speaker-Dependent Neural Vocoder
Adam Finkelstein | Gautham J. Mysore | Zeyu Jin | Jingwan Lu
[1] Homer Dudley,et al. The Vocoder—Electrical Re-Creation of Speech , 1940 .
[2] J. Tukey,et al. An algorithm for the machine calculation of complex Fourier series , 1965 .
[3] S. Imai,et al. Mel Log Spectrum Approximation (MLSA) filter for speech synthesis , 1983 .
[4] T. Dutoit. An introduction to text-to-speech synthesis , 1997 .
[5] Alan W. Black,et al. The CMU Arctic speech databases , 2004, SSW.
[6] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[7] Philipos C. Loizou,et al. Speech Enhancement: Theory and Practice , 2007 .
[8] A. F. Machado,et al. Voice Conversion: A Critical Survey , 2010 .
[9] Michael D. Buhrmester,et al. Amazon's Mechanical Turk , 2011, Perspectives on psychological science : a journal of the Association for Psychological Science.
[10] Daniela Braga,et al. Evaluating Voice Quality and Speech Synthesis Using Crowdsourcing , 2013, TSD.
[11] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[12] Alex Graves,et al. Conditional Image Generation with PixelCNN Decoders , 2016, NIPS.
[13] M. Ramos. Voice Conversion with Deep Learning , 2016 .
[14] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[15] Thomas S. Huang,et al. Fast Wavenet Generation Algorithm , 2016, ArXiv.
[16] Stephen DiVerdi,et al. Cute: A concatenative method for voice conversion using exemplar-based unit selection , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Gautham J. Mysore,et al. Fast and easy crowdsourced perceptual audio evaluation , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Stephen DiVerdi,et al. VoCo , 2017, ACM Trans. Graph..
[19] Samy Bengio,et al. Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model , 2017, ArXiv.
[20] Tomoki Toda,et al. Speaker-Dependent WaveNet Vocoder , 2017, INTERSPEECH.
[21] Adam Coates,et al. Deep Voice: Real-time Neural Text-to-Speech , 2017, ICML.
[22] Yoshua Bengio,et al. Char2Wav: End-to-End Speech Synthesis , 2017, ICLR.
[23] Karen Simonyan,et al. Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders , 2017, ICML.
[24] Tomoki Toda,et al. Statistical Voice Conversion with WaveNet-Based Waveform Generation , 2017, INTERSPEECH.
[25] Xavier Serra,et al. A Wavenet for Speech Denoising , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Heiga Zen,et al. Parallel WaveNet: Fast High-Fidelity Speech Synthesis , 2017, ICML.