Candidate Expansion and Prosody Adjustment for Natural Speech Synthesis Using a Small Corpus
Chung-Hsien Wu | Yan-You Chen | Jhing-Fa Wang | Yi-Chin Huang | Shih-Lun Lin
[1] Simon King, et al. Detection of phonological features in continuous speech using neural networks, 2000, Comput. Speech Lang.
[2] Keiichi Tokuda, et al. A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis, 2007, IEICE Trans. Inf. Syst.
[3] Chung-Hsien Wu, et al. Polyglot Speech Synthesis Based on Cross-Lingual Frame Selection Using Auditory and Articulatory Features, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[4] Mari Ostendorf, et al. Joint prosody prediction and unit selection for concatenative speech synthesis, 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[5] Ren-Hua Wang, et al. HMM-Based Hierarchical Unit Selection Combining Kullback-Leibler Divergence with Likelihood Criterion, 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[6] Chung-Hsien Wu, et al. Automatic generation of synthesis units and prosodic information for Chinese concatenative synthesis, 2001, Speech Commun.
[7] Yu Tsao, et al. A study on detection based automatic speech recognition, 2006, INTERSPEECH.
[8] Chung-Hsien Wu, et al. Natural speech synthesis based on hybrid approach with candidate expansion and verification, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Sin-Horng Chen, et al. Vector quantization of pitch information in Mandarin speech, 1990, IEEE Trans. Commun.
[10] Keiichi Tokuda, et al. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis, 1999, EUROSPEECH.
[11] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.
[12] Hideki Kawahara, et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, 1999, Speech Commun.
[13] Eduardo Rodríguez Banga, et al. A method for combining intonation modelling and speech unit selection in corpus-based speech synthesis systems, 2006, Speech Commun.
[14] Gernot A. Fink, et al. Combining acoustic and articulatory feature information for robust speech recognition, 2002, Speech Commun.
[15] Ren-Hua Wang, et al. The USTC System for Blizzard Challenge 2010, 2008.
[16] Chung-Hsien Wu, et al. Variable-Length Unit Selection in TTS Using Structural Syntactic Cost, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Chin-Hui Lee, et al. A penalized logistic regression approach to detection based phone classification, 2008, INTERSPEECH.
[18] Chung-Hsien Wu, et al. Exploiting Prosody Hierarchy and Dynamic Features for Pitch Modeling and Generation in HMM-Based Speech Synthesis, 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Mari Ostendorf, et al. Moving beyond the 'beads-on-a-string' model of speech, 1999.
[20] Alan W. Black, et al. Unit selection in a concatenative speech synthesis system using a large speech database, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[21] Chin-Hui Lee, et al. Toward a detector-based universal phone recognizer, 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[22] Chung-Hsien Wu, et al. Hierarchical Prosody Conversion Using Regression-Based Clustering for Emotional Speech Synthesis, 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[23] Tanja Schultz, et al. Multilingual articulatory features, 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[24] Thierry Dutoit, et al. Using a pitch-synchronous residual codebook for hybrid HMM/frame selection speech synthesis, 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[25] Chung-Hsien Wu, et al. Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis, 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[26] Heiga Zen, et al. The HMM-based speech synthesis system (HTS) version 2.0, 2007, SSW.
[27] H. Zen, et al. An HMM-based speech synthesis system applied to English, 2002, Proceedings of 2002 IEEE Workshop on Speech Synthesis, 2002.
[28] Nick Campbell, et al. Optimising selection of units from speech databases for concatenative synthesis, 1995, EUROSPEECH.
[29] Sin-Horng Chen, et al. An RNN-based prosodic information synthesizer for Mandarin text-to-speech, 1998, IEEE Trans. Speech Audio Process.
[30] Cenk Demiroglu, et al. Analysis of speaker similarity in the statistical speech synthesis systems using a hybrid approach, 2012, 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO).
[31] Cai Rui. TH-CoSS, a Mandarin Speech Corpus for TTS, 2007.
[32] David Malah, et al. A Hybrid Text-to-Speech System That Combines Concatenative and Statistical Synthesis Units, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[33] Heiga Zen, et al. Statistical Parametric Speech Synthesis, 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[34] Keiichi Tokuda, et al. An adaptive algorithm for mel-cepstral analysis of speech, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[35] Chung-Hsien Wu, et al. Phone set construction based on context-sensitive articulatory attributes for code-switching speech recognition, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[36] Keiichi Tokuda, et al. Speech parameter generation algorithms for HMM-based speech synthesis, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[37] Shinsuke Sakai, et al. A probabilistic approach to unit selection for corpus-based speech synthesis, 2005, INTERSPEECH.
[38] Paul Boersma, et al. Praat, a system for doing phonetics by computer, 2002.
[39] Inma Hernáez, et al. A Hybrid TTS Approach for Prosody and Acoustic Modules, 2011, INTERSPEECH.
[40] Vincent Pollet, et al. Synthesis by generation and concatenation of multiform segments, 2008, INTERSPEECH.
[41] Chung-Hsien Wu, et al. Personalized Spectral and Prosody Conversion Using Frame-Based Codeword Distribution and Adaptive CRF, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[42] Keiichi Tokuda, et al. Mel-generalized cepstral analysis - a unified approach to speech spectral estimation, 1994, ICSLP.
[43] Chung-Hsien Wu, et al. Residual compensation based on articulatory feature-based phone clustering for hybrid Mandarin speech synthesis, 2013, SSW.
[44] P. Hoole, et al. Tone-Vowel Interaction in Standard Chinese, 2004.