Voice source modelling using deep neural networks for statistical parametric speech synthesis
Paavo Alku | Martti Vainio | John Kane | Simon King | Heng Lu | Tuomo Raitio | Antti Suni
[1] Hideki Kawahara,et al. Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT , 2001, MAVEBA.
[2] J. Makhoul,et al. Linear prediction: A tutorial review , 1975, Proceedings of the IEEE.
[3] Nam Soo Kim,et al. Excitation modeling based on waveform interpolation for HMM-based speech synthesis , 2010, INTERSPEECH.
[4] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012.
[5] Junichi Yamagishi,et al. Towards an improved modeling of the glottal source in statistical parametric speech synthesis , 2007, SSW.
[6] Heiga Zen,et al. An excitation model for HMM-based speech synthesis based on residual modeling , 2007, SSW.
[7] Paavo Alku,et al. HMM-Based Speech Synthesis Utilizing Glottal Inverse Filtering , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[8] Paavo Alku,et al. HMM-based Finnish text-to-speech system utilizing glottal inverse filtering , 2008, INTERSPEECH.
[9] Thierry Dutoit,et al. The Deterministic Plus Stochastic Model of the Residual Signal and Its Applications , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[11] Junichi Yamagishi,et al. Glottal spectral separation for parametric speech synthesis , 2008, INTERSPEECH.
[12] Thierry Dutoit,et al. A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis , 2009, INTERSPEECH.
[13] Alan W. Black,et al. Unit selection in a concatenative speech synthesis system using a large speech database , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[14] Paavo Alku,et al. Comparing glottal-flow-excited statistical parametric speech synthesis methods , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[15] J. Holmes,et al. The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer , 1973.
[16] Keiichi Tokuda,et al. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis , 1999, EUROSPEECH.
[17] Minsoo Hahn,et al. Two-Band Excitation for HMM-Based Speech Synthesis , 2007, IEICE Trans. Inf. Syst.
[18] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[19] Heiga Zen,et al. Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters , 2010, SSW.
[20] Heiga Zen,et al. The HMM-based speech synthesis system (HTS) version 2.0 , 2007, SSW.
[21] Paavo Alku,et al. Glottal wave analysis with Pitch Synchronous Iterative Adaptive Inverse Filtering , 1991, Speech Commun.
[22] Keiichi Tokuda,et al. Speaker interpolation in HMM-based speech synthesis system , 1997, EUROSPEECH.
[23] Paavo Alku,et al. Utilizing glottal source pulse library for generating improved excitation signal for HMM-based speech synthesis , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Thierry Dutoit,et al. Using a pitch-synchronous residual codebook for hybrid HMM/frame selection speech synthesis , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[25] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun.
[26] G. Fries. Hybrid time- and frequency-domain speech synthesis with extended glottal source generation , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[27] Kenji Matsui,et al. Improving naturalness in text-to-speech synthesis using natural glottal source , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.