Personalized Spontaneous Speech Synthesis Using a Small-Sized Unsegmented Semispontaneous Speech
Chung-Hsien Wu | Yan-You Chen | Jhing-Fa Wang | Yi-Chin Huang | Ming-Ge Shie
[1] Keiichi Tokuda,et al. Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[2] Simon King,et al. Detection of phonological features in continuous speech using neural networks , 2000, Comput. Speech Lang..
[3] Gayatri M. Bhandari,et al. Audio Segmentation for Speech Recognition Using Segment Features , 2014 .
[4] Mari Ostendorf,et al. Moving beyond the 'beads-on-a-string' model of speech , 1999 .
[5] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun..
[6] Geoffrey Zweig,et al. A segmental CRF approach to large vocabulary continuous speech recognition , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[7] Chin-Hui Lee,et al. Toward a detector-based universal phone recognizer , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[8] Takashi Nose,et al. Conversational spontaneous speech synthesis using average voice model , 2010, INTERSPEECH.
[9] Asaf Rendel,et al. Towards automatic phonetic segmentation for TTS , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Joan Claudi Socoró,et al. Voice Quality Modelling for Expressive Speech Synthesis , 2014, TheScientificWorldJournal.
[11] Keiichi Tokuda,et al. Mel-generalized cepstral analysis - a unified approach to speech spectral estimation , 1994, ICSLP.
[12] Chung-Hsien Wu,et al. Multiple change-point audio segmentation and classification using an MDL-based Gaussian model , 2006, IEEE Trans. Speech Audio Process..
[13] Chung-Hsien Wu,et al. Synthesis of Spontaneous Speech With Syllable Contraction Using State-Based Context-Dependent Voice Transformation , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[14] Alan W. Black,et al. Prediction of pronunciation variations for speech synthesis: a data-driven approach , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
[15] Tanja Schultz,et al. Multilingual articulatory features , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[16] Mari Ostendorf,et al. Joint prosody prediction and unit selection for concatenative speech synthesis , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[17] Mauro Cettolo,et al. Evaluation of BIC-based algorithms for audio segmentation , 2005, Comput. Speech Lang..
[18] Takao Kobayashi,et al. Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Takashi Nose,et al. On the Use of Extended Context for HMM-Based Spontaneous Conversational Speech Synthesis , 2011, INTERSPEECH.
[20] Tomoki Toda,et al. Parameter generation algorithm considering Modulation Spectrum for HMM-based speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Tuomo Raitio,et al. A Deep Generative Architecture for Postfiltering in Statistical Parametric Speech Synthesis , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[22] Gernot A. Fink,et al. Combining acoustic and articulatory feature information for robust speech recognition , 2002, Speech Commun..
[23] Chiu-yu Tseng. Speech Rate and Prosody Units: Evidence of Interaction from Mandarin Chinese , 2003 .
[24] Keiichi Tokuda,et al. An adaptive algorithm for mel-cepstral analysis of speech , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[25] Keiichi Tokuda,et al. A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis , 2007, IEICE Trans. Inf. Syst..
[26] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[27] Richard M. Schwartz,et al. Practical Implementations of Speaker-Adaptive Training , 1997 .
[28] Chung-Hsien Wu,et al. Automatic generation of synthesis units and prosodic information for Chinese concatenative synthesis , 2001, Speech Commun..
[29] Chung-Hsien Wu,et al. Pronunciation variation generation for spontaneous speech synthesis using state-based voice transformation , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[30] Yu Tsao,et al. A study on detection based automatic speech recognition , 2006, INTERSPEECH.
[31] Jr. G. Forney,et al. Viterbi Algorithm , 1973, Encyclopedia of Machine Learning.
[32] Tomoki Toda,et al. Postfilters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[33] K. Tokuda,et al. A Training Method of Average Voice Model for HMM-Based Speech Synthesis , 2003, IEICE Trans. Fundam. Electron. Commun. Comput. Sci..
[34] Tomoki Toda,et al. Post-Filters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis , 2016 .
[35] Cai Rui. TH-CoSS, a Mandarin Speech Corpus for TTS , 2007 .
[36] Chung-Hsien Wu,et al. Idiolect Extraction and Generation for Personalized Speaking Style Modeling , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[37] Rüdiger Hoffmann,et al. Toward spontaneous speech synthesis - utilizing language model information in TTS , 2004, IEEE Transactions on Speech and Audio Processing.
[38] Ellen Eide. Distinctive features for use in an automatic speech recognition system , 2001, INTERSPEECH.
[39] Chiu-yu Tseng,et al. Mandarin spontaneous narrative planning - prosodic evidence from National Taiwan University Lecture Corpus , 2009, INTERSPEECH.
[40] Chung-Hsien Wu,et al. Personalized Spectral and Prosody Conversion Using Frame-Based Codeword Distribution and Adaptive CRF , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[41] Tatsuya Kawahara,et al. Statistical Transformation of Language and Pronunciation Models for Spontaneous Speech Recognition , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[42] Junichi Yamagishi,et al. Utilising spontaneous conversational speech in HMM-based speech synthesis , 2010, SSW.
[43] Chung-Hsien Wu,et al. Hierarchical Prosody Conversion Using Regression-Based Clustering for Emotional Speech Synthesis , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[44] Takashi Nose,et al. Prosodic variation enhancement using unsupervised context labeling for HMM-based expressive speech synthesis , 2014, Speech Commun..
[45] Nick Campbell,et al. Optimising selection of units from speech databases for concatenative synthesis , 1995, EUROSPEECH.
[46] Junichi Yamagishi,et al. Average-Voice-Based Speech Synthesis , 2006 .
[47] Peter Grünwald,et al. A tutorial introduction to the minimum description length principle , 2004, ArXiv.
[48] Keiichi Tokuda,et al. Incorporating a mixed excitation model and postfilter into HMM-based text-to-speech synthesis , 2005, Systems and Computers in Japan.
[49] Kishore Prahallad,et al. Sub-Phonetic Modeling For Capturing Pronunciation Variations For Conversational Speech Synthesis , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[50] Alan W. Black,et al. Optimizing segment label boundaries for statistical speech synthesis , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[51] Chung-Hsien Wu,et al. Fluent personalized speech synthesis with prosodic word-level spontaneous speech generation , 2015, INTERSPEECH.
[52] Chin-Hui Lee,et al. A penalized logistic regression approach to detection based phone classification , 2008, INTERSPEECH.
[53] Alan W. Black,et al. Unit selection in a concatenative speech synthesis system using a large speech database , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.