Junichi Yamagishi | Nobuyuki Nishizawa | Xin Wang | Hieu-Thi Luong
[1] Patrick Nguyen,et al. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis , 2018, NeurIPS.
[2] Simon King,et al. Thousands of Voices for HMM-Based Speech Synthesis–Analysis and Application of TTS Systems Built on Various ASR Corpora , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[3] Srikanth Ronanki,et al. Effect of Data Reduction on Sequence-to-sequence Neural TTS , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Junichi Yamagishi,et al. An autoregressive recurrent mixture density network for parametric speech synthesis , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Pierre Lanchantin,et al. Data Selection for Improving Naturalness of TTS Voices Trained on Small Found Corpuses , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[6] Xin Wang,et al. Investigating accuracy of pitch-accent annotations in neural network-based speech synthesis and denoising effects , 2018, INTERSPEECH.
[7] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[8] Wei Zhang,et al. Corpus building for data-driven TTS systems , 2002, Proceedings of 2002 IEEE Workshop on Speech Synthesis, 2002.
[9] Yevgen Chebotar,et al. Distilling Knowledge from Ensembles of Neural Networks for Speech Recognition , 2016, INTERSPEECH.
[10] Thierry Dutoit,et al. Text design for TTS speech corpus building using a modified greedy selection , 2003, INTERSPEECH.
[11] Leo Breiman,et al. Stacked regressions , 1996, Machine Learning.
[12] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Heiga Zen,et al. Sample Efficient Adaptive Text-to-Speech , 2018, ICLR.
[14] Julia Hirschberg,et al. Utterance Selection for Optimizing Intelligibility of TTS Voices Trained on ASR Data , 2017, INTERSPEECH.
[15] Xin Wang,et al. Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] David H. Wolpert,et al. Stacked generalization , 1992, Neural Networks.
[17] Tomoki Toda,et al. An investigation of multi-speaker training for wavenet vocoder , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[18] Li Deng,et al. Ensemble deep learning for speech recognition , 2014, INTERSPEECH.
[19] Nitesh V. Chawla,et al. SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..
[20] Ponnuthurai N. Suganthan,et al. Ensemble Classification and Regression-Recent Developments, Applications and Future Directions [Review Article] , 2016, IEEE Computational Intelligence Magazine.
[21] Frank K. Soong,et al. Multi-speaker modeling and speaker adaptation for DNN-based TTS synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Victor Ungureanu,et al. Experiments with Training Corpora for Statistical Text-to-speech Systems , 2018, INTERSPEECH.
[23] Nobuaki Minematsu,et al. Speaker Representations for Speaker Adaptation in Multiple Speakers' BLSTM-RNN-Based Speech Synthesis , 2016, INTERSPEECH.
[24] Yusuke Ijima,et al. DNN-Based Speech Synthesis Using Speaker Codes , 2018, IEICE Trans. Inf. Syst..
[25] Julia Hirschberg,et al. A Comparison of Speaker-based and Utterance-based Data Selection for Text-to-Speech Synthesis , 2018, INTERSPEECH.
[26] Nathalie Japkowicz,et al. The Class Imbalance Problem: Significance and Strategies , 2000.
[27] Stan Matwin,et al. Addressing the Curse of Imbalanced Training Sets: One-Sided Selection , 1997, ICML.