Voice Morphing that improves TTS quality using an optimal dynamic frequency warping-and-weighting transform
暂无分享,去创建一个
[1] Hui Liang,et al. VTLN-Based Rapid Cross-Lingual Adaptation for Statistical Parametric Speech Synthesis , 2012 .
[2] Orhan Karaali,et al. Speech Synthesis with Neural Networks , 1998, ArXiv.
[3] Thomas Quatieri,et al. Discrete-Time Speech Signal Processing: Principles and Practice , 2001 .
[4] Yannis Agiomyrgiannakis. The matching-minimization algorithm, the INCA algorithm and a mathematical framework for voice conversion with unaligned corpora , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Moncef Gabbouj,et al. Voice conversion for non-parallel datasets using dynamic kernel partial least squares regression , 2013, INTERSPEECH.
[6] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[7] Inma Hernáez,et al. Towards Physically Interpretable Parametric Voice Conversion Functions , 2013, NOLISP.
[8] Hideki Kawahara,et al. STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds , 2006 .
[9] Keiichi Tokuda,et al. Spectral conversion based on maximum likelihood estimation considering global variance of converted parameter , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[10] Tomoki Toda,et al. One-to-Many and Many-to-One Voice Conversion Based on Eigenvoices , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[11] Daniel Erro,et al. Voice Conversion Based on Weighted Frequency Warping , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[12] Hui Ye,et al. High quality voice morphing , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[13] Zhizheng Wu,et al. A study of speaker adaptation for DNN-based speech synthesis , 2015, INTERSPEECH.
[14] Hui Liang,et al. Implementation of VTLN for statistical speech synthesis , 2010, SSW.
[15] Daniel Erro,et al. Weighted frequency warping for voice conversion , 2007, INTERSPEECH.
[16] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[17] Keiichi Tokuda,et al. Mel-generalized cepstral analysis - a unified approach to speech spectral estimation , 1994, ICSLP.
[18] Yannis Agiomyrgiannakis,et al. Vocaine the vocoder and applications in speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Olivier Rosec,et al. Voice Conversion Using Dynamic Frequency Warping With Amplitude Scaling, for Parallel or Nonparallel Corpora , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[20] Eric Moulines,et al. Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..
[21] Alexander Kain,et al. Spectral voice conversion for text-to-speech synthesis , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[22] Takao Kobayashi,et al. Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[23] S. Chiba,et al. Dynamic programming algorithm optimization for spoken word recognition , 1978 .
[24] Inma Hernáez,et al. Improving the Quality of Standard GMM-Based Voice Conversion Systems by Considering Physically Motivated Linear Transformations , 2012, IberSPEECH.
[25] Tetsuya Takiguchi,et al. Voice conversion in high-order eigen space using deep belief nets , 2013, INTERSPEECH.
[26] Heiga Zen,et al. Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Junichi Yamagishi,et al. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit , 2017 .
[28] Sadaoki Furui,et al. Automatic speech summarization applied to English broadcast news speech , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[29] Eric Moulines,et al. Voice transformation using PSOLA technique , 1991, Speech Commun..
[30] Hui Liang,et al. VTLN adaptation for statistical speech synthesis , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.