Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm
暂无分享,去创建一个
Takao Kobayashi | Junichi Yamagishi | Katsumi Ogata | Yuji Nakano | Juri Isogai | J. Yamagishi | Takao Kobayashi | Katsumi Ogata | Yuji Nakano | Juri Isogai | T. Kobayashi | Takao Kobayashi
[1] Stephen E. Levinson,et al. Continuously variable duration hidden Markov models for automatic speech recognition , 1986 .
[2] Ramesh A. Gopinath,et al. Maximum likelihood modeling with Gaussian distributions for classification , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[3] Mark J. F. Gales,et al. Mean and variance adaptation within the MLLR framework , 1996, Comput. Speech Lang..
[4] Shigeru Katagiri,et al. A large-scale Japanese speech database , 1990, ICSLP.
[5] Keiichi Tokuda,et al. Voice characteristics conversion for HMM-based speech synthesis system , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[6] K. Tokuda,et al. Speech parameter generation from HMM using dynamic features , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[7] Takao Kobayashi,et al. Acoustic Modeling of Speaking Styles and Emotional Expressions in HMM-Based Speech Synthesis , 2005, IEICE Trans. Inf. Syst..
[8] Takao Kobayashi,et al. Model Adaptation Approach to Speech Synthesis with Diverse Voices and Styles , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[9] Keiichi Tokuda,et al. Speech synthesis using HMMs with dynamic features , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[10] Masatsune Tamura,et al. A Context Clustering Technique for Average Voice Models , 2003 .
[11] Takao Kobayashi,et al. A Style Adaptation Technique for Speech Synthesis Using HSMM and Suprasegmental Features , 2006, IEICE Trans. Inf. Syst..
[12] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..
[13] Takao Kobayashi,et al. Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis , 2006, INTERSPEECH.
[14] Chin-Hui Lee,et al. A structural Bayes approach to speaker adaptation , 2001, IEEE Trans. Speech Audio Process..
[15] Takao Kobayashi,et al. Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training , 2007, IEICE Trans. Inf. Syst..
[16] Mark J. F. Gales,et al. Semi-tied covariance matrices for hidden Markov models , 1999, IEEE Trans. Speech Audio Process..
[17] Nick Campbell,et al. Optimising selection of units from speech databases for concatenative synthesis , 1995, EUROSPEECH.
[18] Heiga Zen,et al. The HMM-based speech synthesis system (HTS) version 2.0 , 2007, SSW.
[19] D. Rubin,et al. Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .
[20] Takao Kobayashi,et al. Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis , 2005, INTERSPEECH.
[21] Takao Kobayashi,et al. Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesis , 2006, INTERSPEECH.
[22] Arjun K. Gupta,et al. Elliptically contoured models in statistics , 1993 .
[23] Jun-ichi Takahashi,et al. Vector-field-smoothed Bayesian learning for fast and incremental speaker/telephone-channel adaptation , 1997, Comput. Speech Lang..
[24] Vassilios Digalakis,et al. Speaker adaptation using constrained estimation of Gaussian mixtures , 1995, IEEE Trans. Speech Audio Process..
[25] Jen-Tzung Chien,et al. Improved Bayesian learning of hidden Markov models for speaker adaptation , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[26] Matthew J. Makashay,et al. Corpus-based techniques in the AT&t nextgen synthesis system , 2000, INTERSPEECH.
[27] Vassilios Digalakis,et al. Speaker adaptation using combined transformation and Bayesian methods , 1996, IEEE Trans. Speech Audio Process..
[28] Heiga Zen,et al. Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV , 2007, SSW.
[29] Alan W. Black. Unit selection and emotional speech , 2003, INTERSPEECH.
[30] Keiichi Tokuda,et al. An adaptive algorithm for mel-cepstral analysis of speech , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[31] R. Moore,et al. Explicit modelling of state occupancy in hidden Markov models for automatic speech recognition , 1985, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[32] Chin-Hui Lee,et al. Structural maximum a posteriori linear regression for fast HMM adaptation , 2002, Comput. Speech Lang..
[33] Mark J. F. Gales,et al. Multiple-cluster adaptive training schemes , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[34] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..
[35] Takashi Nose,et al. A Style Control Technique for HMM-Based Expressive Speech Synthesis , 2007, IEICE Trans. Inf. Syst..
[36] Keiichi Tokuda,et al. A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis , 2007, IEICE Trans. Inf. Syst..
[37] Keiichi Tokuda,et al. Multi-Space Probability Distribution HMM , 2002 .
[38] Koichi Shinoda,et al. MDL-based context-dependent subword modeling for speech recognition , 2000 .
[39] Alan W. Black,et al. Unit selection in a concatenative speech synthesis system using a large speech database , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[40] Takao Kobayashi,et al. HSMM-Based Model Adaptation Algorithms for Average-Voice-Based Speech Synthesis , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[41] Keiichi Tokuda,et al. Speaker adaptation for HMM-based speech synthesis system using MLLR , 1998, SSW.
[42] Richard M. Schwartz,et al. A compact model for speaker-adaptive training , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[43] Tetsuo Kosaka,et al. Speaker adaptation based on transfer vector field smoothing using maximum a posteriori probability estimation , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[44] Biing-Hwang Juang,et al. Signal bias removal by maximum likelihood estimation for robust telephone speech recognition , 1996, IEEE Trans. Speech Audio Process..
[45] Philip C. Woodland,et al. A hidden Markov-model-based trainable speech synthesizer , 1999, Comput. Speech Lang..
[46] K. Tokuda,et al. A Training Method of Average Voice Model for HMM-Based Speech Synthesis , 2003, IEICE Trans. Fundam. Electron. Commun. Comput. Sci..
[47] Keiichi Tokuda,et al. Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[48] Chin-Hui Lee,et al. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process..
[49] Keiichi Tokuda,et al. Speaker adaptation of pitch and spectrum for HMM-based speech synthesis , 2002, Systems and Computers in Japan.
[50] Keiichi Tokuda,et al. Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[51] Takao Kobayashi,et al. Speech Synthesis with Various Emotional Expressions and Speaking Styles by Style Interpolation and Morphing , 2005, IEICE Trans. Inf. Syst..
[52] Satoshi Imai,et al. Cepstral analysis synthesis on the mel frequency scale , 1983, ICASSP.
[53] Heiga Zen,et al. Hidden Semi-Markov Model Based Speech Synthesis System , 2006 .
[54] Takao Kobayashi,et al. Robust F0 Estimation of Speech Signal Using Harmonicity Measure Based on Instantaneous Frequency , 2004, IEICE Trans. Inf. Syst..
[55] Heiga Zen,et al. The Nitech-NAIST HMM-Based Speech Synthesis System for the Blizzard Challenge 2006 , 2006, IEICE Trans. Inf. Syst..
[56] Keiichi Tokuda,et al. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis , 1999, EUROSPEECH.
[57] Sadaoki Furui,et al. New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer , 2006, Speech Commun..
[58] Ren-Hua Wang,et al. HMM-Based Emotional Speech Synthesis Using Average Emotion Model , 2006, ISCSLP.