Segment Level Voice Conversion with Recurrent Neural Networks
Ramón Fernández Astudillo | Isabel Trancoso | Alan W. Black | Nuno Fonseca | Miguel Varela Ramos
[1] Tetsuya Takiguchi, et al. Voice conversion using speaker-dependent conditional restricted Boltzmann machine, 2015, EURASIP Journal on Audio, Speech, and Music Processing.
[2] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[3] Tomoki Toda, et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[4] Alexander Kain, et al. Spectral voice conversion for text-to-speech synthesis, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[5] Moncef Gabbouj, et al. Voice Conversion Using Dynamic Kernel Partial Least Squares Regression, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[6] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[7] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[8] Hideki Kawahara, et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, 1999, Speech Commun.
[9] Yu Tsao, et al. A Study of Mutual Information for GMM-Based Spectral Conversion, 2012, INTERSPEECH.
[10] Kun Li, et al. Voice conversion using deep Bidirectional Long Short-Term Memory based Recurrent Neural Networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Yu Tsao, et al. Incorporating global variance in the training phase of GMM-based voice conversion, 2013, 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference.
[12] Paul Taylor, et al. The architecture of the Festival speech synthesis system, 1998, SSW.
[13] Yoshua Bengio, et al. Learning long-term dependencies with gradient descent is difficult, 1994, IEEE Trans. Neural Networks.
[14] Haizhou Li, et al. Exemplar-based voice conversion using joint nonnegative matrix factorization, 2015, Multimedia Tools and Applications.
[15] Alan W. Black, et al. The CMU Arctic speech databases, 2004, SSW.
[16] Joan Bruna, et al. Voice Conversion using Convolutional Neural Networks, 2016, ArXiv.
[17] Zhi Zheng Wu, et al. Spectral mapping for voice conversion, 2015.
[18] Keiichi Tokuda, et al. Mel-generalized cepstral analysis - a unified approach to speech spectral estimation, 1994, ICSLP.
[19] John Salvatier, et al. Theano: A Python framework for fast computation of mathematical expressions, 2016, ArXiv.
[20] Chng Eng Siong, et al. High quality voice conversion using prosodic and high-resolution spectral features, 2015, Multimedia Tools and Applications.