Model-Based Parametric Prosody Synthesis with Deep Neural Network
[1] Heiga Zen,et al. Speech Synthesis Based on Hidden Markov Models , 2013, Proceedings of the IEEE.
[2] Frank K. Soong,et al. Modeling pitch trajectory by hierarchical HMM with minimum generation error training , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Santitham Prom-on,et al. Toward invariant functional representations of variable surface fundamental frequency contours: Synthesizing speech melody via model-based stochastic learning , 2014, Speech Commun.
[4] Frank K. Soong,et al. A hierarchical F0 modeling method for HMM-based speech synthesis , 2010, INTERSPEECH.
[5] Keiichi Tokuda,et al. Hidden Markov models based on multi-space probability distribution for pitch pattern modeling , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[6] Keiichi Tokuda,et al. Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[7] Heiga Zen,et al. Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends , 2015, IEEE Signal Processing Magazine.
[8] Philip N. Garner,et al. Convolutional Pitch Target Approximation Model for Speech Synthesis , 2013.
[9] Emily Q. Wang,et al. Pitch targets and their realization: Evidence from Mandarin Chinese , 2001, Speech Commun.
[10] Lianhong Cai,et al. Modeling pitch contour of Chinese Mandarin sentences with the PENTA model , 2012.
[11] Jorge J. Moré,et al. The Levenberg-Marquardt algorithm: Implementation and theory , 1977.
[12] Frank K. Soong,et al. TTS synthesis with bidirectional LSTM based recurrent neural networks , 2014, INTERSPEECH.
[13] Heiga Zen,et al. Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Heiga Zen,et al. Context-dependent additive log f_0 model for HMM-based speech synthesis , 2009, INTERSPEECH.
[15] Santitham Prom-on,et al. Modeling tone and intonation in Mandarin and English as a process of target approximation. , 2009, The Journal of the Acoustical Society of America.
[16] Frank K. Soong,et al. On the training aspects of Deep Neural Network (DNN) for parametric TTS synthesis , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Zhen-Hua Ling,et al. Vowel Creation by Articulatory Control in HMM-based Parametric Speech Synthesis , 2012, INTERSPEECH.
[18] Frank K. Soong,et al. Modeling DCT parameterized F0 trajectory at intonation phrase level with DNN or decision tree , 2014, INTERSPEECH.
[19] Ren-Hua Wang,et al. Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge , 2008, INTERSPEECH.
[20] Patricia Riddle,et al. Modelling and synthesising F0 contours with the discrete cosine transform , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[21] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[22] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[23] Heiga Zen,et al. Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences , 2007, Comput. Speech Lang.
[24] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun.
[25] Xihong Wu,et al. Hierarchical pitch target model for Mandarin speech , 2010, 2010 7th International Symposium on Chinese Spoken Language Processing.
[26] Heiga Zen,et al. Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Yi Xu,et al. Speech melody as articulatorily implemented communicative functions , 2005, Speech Commun.
[28] Helen M. Meng,et al. Multi-distribution deep belief network for speech synthesis , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[29] Ren-Hua Wang,et al. Integrating Articulatory Features Into HMM-Based Parametric Speech Synthesis , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[30] Keiichi Tokuda,et al. Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis , 2004, SSW.
[31] M. Newville,et al. Lmfit: Non-Linear Least-Square Minimization and Curve-Fitting for Python , 2014.
[32] Kai Yu,et al. Continuous F0 Modeling for HMM Based Statistical Parametric Speech Synthesis , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[33] David Talkin,et al. A Robust Algorithm for Pitch Tracking (RAPT) , 2005.
[34] Zhizheng Wu,et al. Modeling and Generating Tone Contour with Phrase Intonation for Mandarin Chinese Speech , 2008, 2008 6th International Symposium on Chinese Spoken Language Processing.
[35] Dong Yu,et al. Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[36] Li-Rong Dai,et al. Improving F0 prediction using bidirectional associative memories and syllable-level F0 features for HMM-based Mandarin speech synthesis , 2014, The 9th International Symposium on Chinese Spoken Language Processing.
[37] Zhen-Hua Ling,et al. Articulatory Control of HMM-Based Parametric Speech Synthesis Using Feature-Space-Switched Multiple Regression , 2013, IEEE Transactions on Audio, Speech, and Language Processing.