Speaker Adaptation in DNN-Based Speech Synthesis Using d-Vectors
暂无分享,去创建一个
Ranniery Maia | Norbert Braunschweiler | Rama Doddipatla | N. Braunschweiler | R. Maia | Rama Doddipatla
[1] Heiga Zen,et al. Statistical Parametric Speech Synthesis Based on Speaker and Language Factorization , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[2] Frank K. Soong,et al. Unsupervised speaker adaptation for DNN-based TTS synthesis , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[4] Mike Brookes,et al. Estimation of Glottal Closure Instants in Voiced Speech Using the DYPSA Algorithm , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[5] Moncef Gabbouj,et al. Ways to Implement Global Variance in Statistical Speech Synthesis , 2012, INTERSPEECH.
[6] Florin Curelaru,et al. Front-End Factor Analysis For Speaker Verification , 2018, 2018 International Conference on Communications (COMM).
[7] Arturo Camacho Lozano,et al. SWIPE: A Sawtooth Waveform Inspired Pitch Estimator for Speech and Music , 2011 .
[8] Simon King,et al. Deep neural networks employing Multi-Task Learning and stacked bottleneck features for speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Florian Metze,et al. Towards speaker adaptive training of deep neural network acoustic models , 2014, INTERSPEECH.
[10] Mark J. F. Gales,et al. Photo-realistic expressive text to talking head synthesis , 2013, INTERSPEECH.
[11] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, IEEE International Conference on Acoustics, Speech, and Signal Processing.
[12] Alan W. Black,et al. Unit selection in a concatenative speech synthesis system using a large speech database , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[13] Yusuke Ijima,et al. An Investigation of DNN-Based Speech Synthesis Using Speaker Codes , 2016, INTERSPEECH.
[14] Mark J. F. Gales,et al. Complex cepstrum for statistical parametric speech synthesis , 2013, Speech Commun..
[15] Erik McDermott,et al. Deep neural networks for small footprint text-dependent speaker verification , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Takao Kobayashi,et al. Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[18] Frank K. Soong,et al. Multi-speaker modeling and speaker adaptation for DNN-based TTS synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Heiga Zen,et al. Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Nobuaki Minematsu,et al. Speaker Representations for Speaker Adaptation in Multiple Speakers' BLSTM-RNN-Based Speech Synthesis , 2016, INTERSPEECH.
[21] Keiichi Tokuda,et al. Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[22] Frank K. Soong,et al. TTS synthesis with bidirectional LSTM based recurrent neural networks , 2014, INTERSPEECH.
[23] Zhizheng Wu,et al. A study of speaker adaptation for DNN-based speech synthesis , 2015, INTERSPEECH.
[24] Frank K. Soong,et al. Speaker and language factorization in DNN-based TTS synthesis , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).