
Simultaneous modeling of phonetic and prosodic parameters, and characteristic conversion for HMM-based text-to-speech systems

Takayoshi Yoshimura (吉村貴克)

[1] Naofumi Aoki, et al. Development of a rule-based speech synthesis system for the Japanese language using a MELP vocoder, 2000, 2000 10th European Signal Processing Conference.

[2] Keiichi Tokuda, et al. Speech parameter generation algorithms for HMM-based speech synthesis, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[3] Alex Acero, et al. Formant analysis and synthesis using hidden Markov models, 1999, EUROSPEECH.

[4] Wu Chou, et al. Decision tree state tying based on penalized Bayesian information criterion, 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[5] Keiichi Tokuda, et al. Hidden Markov models based on multi-space probability distribution for pitch pattern modeling, 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[6] Yoshinori Sagisaka, et al. Automatic generation of multiple pronunciations based on neural networks, 1999, Speech Commun.

[7] Keiichi Tokuda, et al. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis, 1999, EUROSPEECH.

[8] Robert E. Donovan, et al. The IBM trainable speech synthesis system, 1998, ICSLP.

[9] Keiichi Tokuda, et al. Duration modeling for HMM-based speech synthesis, 1998, ICSLP.

[10] Keiichi Tokuda, et al. Speaker adaptation for HMM-based speech synthesis system using MLLR, 1998, SSW.

[11] Mark J. F. Gales, et al. A comparative study of methods for phonetic decision-tree state clustering, 1997, EUROSPEECH.

[12] Keiichi Tokuda, et al. Speaker interpolation in HMM-based speech synthesis system, 1997, EUROSPEECH.

[13] Hideki Kawahara, et al. Speech representation and transformation using adaptive interpolation of weighted spectrum: vocoder revisited, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[14] Alex Acero, et al. Recent improvements on Microsoft's trainable text-to-speech system-Whistler, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[15] Keiichi Tokuda, et al. Voice characteristics conversion for HMM-based speech synthesis system, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[16] K. Koishida, et al. Vector quantization of speech spectral parameters using statistics of dynamic features, 1997.

[17] Keiichi Tokuda, et al. Speech synthesis using HMMs with dynamic features, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[18] Koichi Shinoda, et al. Speaker adaptation with autonomous model complexity control by MDL principle, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[19] T. Barnwell, et al. A mixed excitation LPC vocoder model for low bit rate speech coding, 1995, IEEE Trans. Speech Audio Process.

[20] Philip C. Woodland, et al. Automatic speech synthesiser parameter estimation using HMMs, 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[21] K. Tokuda, et al. Speech parameter generation from HMM using dynamic features, 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[22] Yoshinori Sagisaka, et al. Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks, 1995, Speech Commun.

[23] Keiichi Tokuda, et al. An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features, 1995, EUROSPEECH.

[24] Paul Dalsgaard, et al. Modelling intonation contours at the phrase level using continuous density hidden Markov models, 1994, Comput. Speech Lang.

[25] Mari Ostendorf, et al. A dynamical system model for generating F0 for synthesis, 1994, SSW.

[26] Andrej Ljolje, et al. Automatic speech segmentation for concatenative inventory selection, 1994, SSW.

[27] Keiichi Tokuda, et al. An adaptive algorithm for mel-cepstral analysis of speech, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[28] Piero Pierucci, et al. Phonetic ergodic HMM for speech synthesis, 1991, EUROSPEECH.

[29] Takao Kobayashi, et al. Complex Chebyshev approximation for IIR digital filters using an iterative WLS technique, 1990, International Conference on Acoustics, Speech, and Signal Processing.

[30] Massimo Giustiniani, et al. A hidden Markov model approach to speech synthesis, 1989, EUROSPEECH.

[31] Alan V. Oppenheim, et al. Discrete-Time Signal Processing, 1989.

[32] Frank Fallside, et al. Lexical stress recognition using hidden Markov models, 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[33] Masafumi Nishimura, et al. HMM-Based speech recognition using multi-dimensional multi-labeling, 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[34] Stephen E. Levinson, et al. Continuously variable duration hidden Markov models for speech analysis, 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[35] Saito, et al. Fundamentals of Speech Signal Processing, 1986.

[36] B.-H. Juang, et al. Maximum-likelihood estimation for mixture multivariate stochastic observations of Markov chains, 1985, AT&T Technical Journal.

[37] Chikio Hayashi. Recent theoretical and methodological developments in multidimensional scaling and its related methods in Japan, 1985.

[38] Satoshi Imai, et al. Cepstral analysis synthesis on the mel frequency scale, 1983, ICASSP.

[39] Louis A. Liporace, et al. Maximum likelihood estimation for multivariate observations of Markov sources, 1982, IEEE Trans. Inf. Theory.

[40] J. Flanagan. Speech Analysis, Synthesis and Perception, 1971.