Hierarchical stress modeling and generation in Mandarin for expressive Text-to-Speech
[1] Frank K. Soong,et al. A hierarchical F0 modeling method for HMM-based speech synthesis , 2010, INTERSPEECH.
[2] Elisabeth Selkirk,et al. Sentence Prosody: Intonation, Stress and Phrasing , 1996 .
[3] Takashi Nose,et al. Prosodic variation enhancement using unsupervised context labeling for HMM-based expressive speech synthesis , 2014, Speech Commun..
[4] Heiga Zen,et al. Deep learning in speech synthesis , 2013, SSW.
[5] Keikichi Hirose,et al. Synthesis of F0 contours using generation process model parameters predicted from unlabeled corpora: application to emotional speech synthesis , 2005, Speech Commun..
[6] Michael Picheny,et al. The IBM expressive text-to-speech synthesis system for American English , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[7] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[8] M. Ortega-Llebaria,et al. Disentangling stress from accent in Spanish: production patterns of the stress contrast in deaccented syllables , 2005 .
[9] Heiga Zen,et al. Context-dependent additive log f_0 model for HMM-based speech synthesis , 2009, INTERSPEECH.
[10] Takao Kobayashi,et al. Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Junichi Yamagishi,et al. HMM-based expressive speech synthesis: towards TTS with arbitrary speaking styles and emotions , 2003 .
[12] Heiga Zen,et al. Context adaptive training with factorized decision trees for HMM-based statistical parametric speech synthesis , 2011, Speech Commun..
[13] Ya Li,et al. Hierarchical Stress Modeling in Mandarin Text-to-Speech , 2011, INTERSPEECH.
[14] Jianfen Cao,et al. On neutral-tone syllables in Mandarin Chinese , 1992 .
[15] Geoffrey E. Hinton,et al. Acoustic Modeling Using Deep Belief Networks , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Julia Hirschberg,et al. Detecting pitch accent using pitch-corrected energy-based predictors , 2007, INTERSPEECH.
[17] Junichi Yamagishi,et al. Glottal Source and Prosodic Prominence Modelling in HMM-based Speech Synthesis for the Blizzard Challenge 2009 , 2009 .
[18] Aijun Li,et al. Prosody conversion from neutral speech to emotional speech , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Y Xu,et al. Production and perception of coarticulated tones. , 1994, The Journal of the Acoustical Society of America.
[20] Hiroya Fujisaki,et al. Dynamic Characteristics of Voice Fundamental Frequency in Speech and Singing , 1983 .
[21] Xu Jiepin. The influence of Chinese sentence stress on pitch and duration , 2000 .
[22] Julia Hirschberg,et al. Pitch Accent in Context: Predicting Intonational Prominence from Text , 1993, Artif. Intell..
[23] Ting,et al. Study on automatic prediction of sentential stress for Chinese Putonghua Text-to-Speech system with natural style , 2007 .
[24] Gao Peng Chen,et al. Quantitative Analysis and Synthesis of Focus in Mandarin , 2004 .
[25] Yasemin Altun,et al. Using Conditional Random Fields to Predict Pitch Accents in Conversational Speech , 2004, ACL.
[26] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[27] Junichi Yamagishi,et al. Identification of contrast and its emphatic realization in HMM based speech synthesis , 2009, INTERSPEECH.
[28] Keikichi Hirose,et al. Prosodic focus control in reply speech generation for a spoken dialogue system of information retrieval , 2002, Proceedings of 2002 IEEE Workshop on Speech Synthesis.
[29] Mari Ostendorf,et al. Automatic labeling of prosodic patterns , 1994, IEEE Trans. Speech Audio Process..
[30] Ming Lei,et al. Investigation of prosodic F0 layers in hierarchical F0 modeling for HMM-based speech synthesis , 2010, IEEE 10th International Conference on Signal Processing Proceedings.
[31] Simon King,et al. Analysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech , 2010, Speech Commun..
[32] Takao Kobayashi,et al. Modeling of various speaking styles and emotions for HMM-based speech synthesis , 2003, INTERSPEECH.
[33] Takao Kobayashi,et al. Speech Synthesis with Various Emotional Expressions and Speaking Styles by Style Interpolation and Morphing , 2005, IEICE Trans. Inf. Syst..
[34] Keikichi Hirose,et al. Hierarchical stress generation with Fujisaki model in expressive speech synthesis , 2014 .
[35] Julia Hirschberg,et al. Detecting Pitch Accent Using Pitch-corrected Energy-based Predictors , 2007 .
[36] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..
[37] Marc Schröder,et al. Expressive Speech Synthesis: Past, Present, and Possible Futures , 2009, Affective Information Processing.
[38] Hiroya Fujisaki,et al. Information, prosody, and modeling - with emphasis on tonal features of speech - , 2004, Speech Prosody 2004.
[39] Ya Li,et al. The Stability Analysis of Disyllabic Stress in Mandarin Speech , 2011, ICPhS.
[40] Shrikanth S. Narayanan,et al. Automatic Prosodic Event Detection Using Acoustic, Lexical, and Syntactic Evidence , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[41] Xuejing Sun,et al. Pitch accent prediction using ensemble machine learning , 2002, INTERSPEECH.
[42] N. Campbell,et al. Conversational speech synthesis and the need for some laughter , 2005, IEEE Transactions on Audio, Speech, and Language Processing.
[43] Bo Xu,et al. From English pitch accent detection to Mandarin stress detection, where is the difference? , 2012, Comput. Speech Lang..
[44] Xiaoying Xu,et al. Influence of rhythm and tone pattern on Mandarin stress perception in continuous speech , 2011 .
[45] 趙 元任,et al. A grammar of spoken Chinese = 中國話的文法 , 1968 .
[46] Anne Cutler,et al. Stress and accent in language production and understanding , 1984 .
[47] Keikichi Hirose,et al. Control of prosodic focus in corpus-based generation of fundamental frequency contours of Japanese based on the generation process model , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[48] Meng Zhang,et al. Text-based unstressed syllable prediction in Mandarin , 2010, INTERSPEECH.
[49] Simon King,et al. Modelling prominence and emphasis improves unit-selection synthesis , 2007, INTERSPEECH.
[50] Zhu Weibin. A Chinese Speech Synthesis System with Capability of Accent Realizing , 2007 .
[51] Nick Campbell,et al. Speech Database Design for a Concatenative Text-to-Speech Synthesis System for Individuals with Communication Disorders , 2003, Int. J. Speech Technol..
[52] Frank K. Soong,et al. Generating natural F0 trajectory with additive trees , 2008, INTERSPEECH.
[53] Takao Kobayashi,et al. A style control technique for HMM-based speech synthesis , 2004, INTERSPEECH.
[54] Lianhong Cai,et al. Modeling Prosody Pattern of Chinese Expressive Speech and Its Application in Personalized Speech Conversion , 2012 .
[55] Kai Yu,et al. Word-level emphasis modelling in HMM-based speech synthesis , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[56] Dirk Heylen,et al. Generating expressive speech for storytelling applications , 2006, IEEE Transactions on Audio, Speech, and Language Processing.