On the Use of Wavenet as a Statistical Vocoder
Vassilis Tsiaras | Yannis Stylianou | Nagaraj Adiga
[1] Tomoki Toda, et al. Speaker-Dependent WaveNet Vocoder, 2017, INTERSPEECH.
[2] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[3] J. Makhoul, et al. Linear prediction: A tutorial review, 1975, Proceedings of the IEEE.
[4] Yannis Stylianou, et al. Applying the harmonic plus noise model in concatenative speech synthesis, 2001, IEEE Trans. Speech Audio Process.
[5] Mark Hasegawa-Johnson, et al. Speech Enhancement Using Bayesian Wavenet, 2017, INTERSPEECH.
[6] Hideki Kawahara, et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, 1999, Speech Commun.
[7] Samy Bengio, et al. Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model, 2017, ArXiv.
[8] Alan W. Black, et al. Unit selection in a concatenative speech synthesis system using a large speech database, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[9] Jesper Jensen, et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Zhizheng Wu, et al. Merlin: An Open Source Neural Network Speech Synthesis System, 2016, SSW.
[11] Thomas F. Quatieri, et al. Speech analysis/synthesis based on a sinusoidal representation, 1986, IEEE Trans. Acoust. Speech Signal Process.
[12] Heiga Zen, et al. Speech Synthesis Based on Hidden Markov Models, 2013, Proceedings of the IEEE.
[13] Eric Moulines, et al. Continuous probabilistic transform for voice conversion, 1998, IEEE Trans. Speech Audio Process.
[14] Heiga Zen, et al. Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends, 2015, IEEE Signal Processing Magazine.
[15] Milos Cernak, et al. An Evaluation of Synthetic Speech Using the PESQ Measure, 2005.
[16] Tomoki Toda, et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Tomoki Toda, et al. Statistical Voice Conversion with WaveNet-Based Waveform Generation, 2017, INTERSPEECH.
[18] S. R. Mahadeva Prasanna, et al. Source modeling for HMM based speech synthesis using integrated LP residual, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Jae S. Lim, et al. Signal estimation from modified short-time Fourier transform, 1983, ICASSP.
[20] Alan W. Black, et al. The CMU Arctic speech databases, 2004, SSW.
[21] Xavier Serra, et al. A Wavenet for Speech Denoising, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Sercan Ömer Arik, et al. Deep Voice 2: Multi-Speaker Neural Text-to-Speech, 2017, NIPS.