LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen | Yu Zhang | Yonghui Wu | Zhifeng Chen | Ye Jia | Ron J. Weiss | Rob Clark | Viet Dang
[1] Szu-Lin Wu, et al. Improving Unsupervised Style Transfer in end-to-end Speech Synthesis with end-to-end Speech Recognition, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[2] Erich Elsen, et al. Efficient Neural Audio Synthesis, 2018, ICML.
[3] Yuxuan Wang, et al. Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron, 2018, ICML.
[4] Lior Wolf, et al. Fitting New Speakers Based on a Short Untranscribed Sample, 2018, ICML.
[5] Sercan Ömer Arik, et al. Neural Voice Cloning with a Few Samples, 2018, NeurIPS.
[6] Samy Bengio, et al. Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model, 2017, ArXiv.
[7] Ming Zhou, et al. Close to Human Quality TTS with Transformer, 2018, ArXiv.
[8] Richard M. Stern, et al. Robust signal-to-noise ratio estimation based on waveform amplitude distribution analysis, 2008, INTERSPEECH.
[9] Yoshua Bengio, et al. Representation Mixing for TTS Synthesis, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Yuxuan Wang, et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis, 2018, ICML.
[11] Patrick Nguyen, et al. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis, 2018, NeurIPS.
[12] Andrew W. Senior, et al. Long short-term memory recurrent neural network architectures for large scale acoustic modeling, 2014, INTERSPEECH.
[13] Soo-Young Lee, et al. Voice Imitating Text-to-Speech Neural Networks, 2018, ArXiv.
[14] Heiga Zen, et al. Hierarchical Generative Modeling for Controllable Speech Synthesis, 2018, ICLR.
[15] Hank Liao, et al. Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription, 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[16] Adam Coates, et al. Deep Voice: Real-time Neural Text-to-Speech, 2017, ICML.
[17] Navdeep Jaitly, et al. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Pedro J. Moreno, et al. A factor automaton approach for the forced alignment of long speech recordings, 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[19] Junichi Yamagishi, et al. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit, 2017.
[20] Heiga Zen, et al. Sample Efficient Adaptive Text-to-Speech, 2018, ICLR.
[21] Sercan Ömer Arik, et al. Deep Voice 3: 2000-Speaker Neural Text-to-Speech, 2017, ICLR 2018.
[22] J. Hintze, et al. Violin plots: A box plot-density trace synergism, 1998.
[23] Sercan Ömer Arik, et al. Deep Voice 2: Multi-Speaker Neural Text-to-Speech, 2017, NIPS.
[24] Yuxuan Wang, et al. Predicting Expressive Speaking Style from Text in End-To-End Speech Synthesis, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[25] Heiga Zen, et al. Statistical Parametric Speech Synthesis, 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[26] Richard Sproat, et al. The Kestrel TTS text normalization system, 2014, Natural Language Engineering.
[27] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Zhizheng Wu, et al. Siri On-Device Deep Learning-Guided Unit Selection Text-to-Speech System, 2017, INTERSPEECH.
[29] Yoshua Bengio, et al. Char2Wav: End-to-End Speech Synthesis, 2017, ICLR.
[30] Yuxuan Wang, et al. Semi-supervised Training for Improving Data Efficiency in End-to-end Speech Synthesis, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Joseph P. Olive, et al. Text-to-speech synthesis, 1995, AT&T Technical Journal.
[32] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[33] Lior Wolf, et al. Voice Synthesis for in-the-Wild Speakers via a Phonological Loop, 2017, ArXiv.
[34] S. King, et al. The Blizzard Challenge 2011, 2011.
[35] Heiga Zen, et al. Parallel WaveNet: Fast High-Fidelity Speech Synthesis, 2017, ICML.