Effect of Data Reduction on Sequence-to-sequence Neural TTS
Srikanth Ronanki | Thomas Drugman | Thomas Merritt | Jaime Lorenzo-Trueba | Javier Latorre | Jakub Lachowicz | Viacheslav Klimkov
[1] Simon King, et al. Robustness of HMM-based speech synthesis, 2008, INTERSPEECH.
[2] Simon King, et al. Measuring a decade of progress in Text-to-Speech, 2014.
[3] Junichi Yamagishi, et al. Thousands of voices for HMM-based speech synthesis, 2010.
[4] Thomas Drugman, et al. Robust universal neural vocoding, 2018, ArXiv.
[5] Colin Raffel, et al. librosa: Audio and Music Signal Analysis in Python, 2015, SciPy.
[6] W. Marsden. I and J, 2012.
[7] Samy Bengio, et al. Tacotron: Towards End-to-End Speech Synthesis, 2017, INTERSPEECH.
[8] Victor Ungureanu, et al. Experiments with Training Corpora for Statistical Text-to-speech Systems, 2018, INTERSPEECH.
[9] Adam Nadolski, et al. Comprehensive Evaluation of Statistical Speech Waveform Synthesis, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[10] Erich Elsen, et al. Efficient Neural Audio Synthesis, 2018, ICML.
[11] Patrick Nguyen, et al. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis, 2018, NeurIPS.
[12] Takao Kobayashi, et al. Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm, 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[13] Junichi Yamagishi, et al. Investigating different representations for modeling and controlling multiple emotions in DNN-based speech synthesis, 2018, Speech Commun.
[14] Tim Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, 2016, NIPS.
[15] Sercan Ömer Arik, et al. Deep Voice 3: 2000-Speaker Neural Text-to-Speech, 2017, ICLR 2018.
[16] Simon King, et al. Statistical analysis of the Blizzard Challenge 2007 listening test results, 2007.
[17] Adam Coates, et al. Deep Voice: Real-time Neural Text-to-Speech, 2017, ICML.
[18] Navdeep Jaitly, et al. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Simon King, et al. Thousands of Voices for HMM-Based Speech Synthesis–Analysis and Application of TTS Systems Built on Various ASR Corpora, 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[20] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[21] Lior Wolf, et al. VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop, 2017, ICLR.
[22] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[23] Sercan Ömer Arik, et al. Deep Voice 2: Multi-Speaker Neural Text-to-Speech, 2017, NIPS.