Simultaneous modeling of phonetic and prosodic parameters, and characteristic conversion for HMM-based text-to-speech systems

[1]  Naofumi Aoki,et al.  Development of a rule-based speech synthesis system for the Japanese language using a MELP vocoder , 2000, 2000 10th European Signal Processing Conference.

[2]  Keiichi Tokuda,et al.  Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[3]  Alex Acero,et al.  Formant analysis and synthesis using hidden Markov models , 1999, EUROSPEECH.

[4]  Wu Chou,et al.  Decision tree state tying based on penalized Bayesian information criterion , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[5]  Keiichi Tokuda,et al.  Hidden Markov models based on multi-space probability distribution for pitch pattern modeling , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[6]  Yoshinori Sagisaka,et al.  Automatic generation of multiple pronunciations based on neural networks , 1999, Speech Commun..

[7]  Keiichi Tokuda,et al.  Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis , 1999, EUROSPEECH.

[8]  Robert E. Donovan,et al.  The IBM trainable speech synthesis system , 1998, ICSLP.

[9]  Keiichi Tokuda,et al.  Duration modeling for HMM-based speech synthesis , 1998, ICSLP.

[10]  Keiichi Tokuda,et al.  Speaker adaptation for HMM-based speech synthesis system using MLLR , 1998, SSW.

[11]  Mark J. F. Gales,et al.  A comparative study of methods for phonetic decision-tree state clustering , 1997, EUROSPEECH.

[12]  Keiichi Tokuda,et al.  Speaker interpolation in HMM-based speech synthesis system , 1997, EUROSPEECH.

[13]  Hideki Kawahara,et al.  Speech representation and transformation using adaptive interpolation of weighted spectrum: vocoder revisited , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[14]  Alex Acero,et al.  Recent improvements on Microsoft's trainable text-to-speech system-Whistler , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[15]  Keiichi Tokuda,et al.  Voice characteristics conversion for HMM-based speech synthesis system , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[16]  K. Koishida,et al.  Vector quantization of speech spectral parameters using statistics of dynamic features , 1997 .

[17]  Keiichi Tokuda,et al.  Speech synthesis using HMMs with dynamic features , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[18]  Koichi Shinoda,et al.  Speaker adaptation with autonomous model complexity control by MDL principle , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[19]  T. Barnwell,et al.  A mixed excitation LPC vocoder model for low bit rate speech coding , 1995, IEEE Trans. Speech Audio Process..

[20]  Philip C. Woodland,et al.  Automatic speech synthesiser parameter estimation using HMMs , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[21]  K. Tokuda,et al.  Speech parameter generation from HMM using dynamic features , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[22]  Yoshinori Sagisaka,et al.  Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks , 1995, Speech Commun..

[23]  Keiichi Tokuda,et al.  An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features , 1995, EUROSPEECH.

[24]  Paul Dalsgaard,et al.  Modelling intonation contours at the phrase level using continuous density hidden Markov models , 1994, Comput. Speech Lang..

[25]  Mari Ostendorf,et al.  A dynamical system model for generating F0 for synthesis , 1994, SSW.

[26]  Andrej Ljolje,et al.  Automatic speech segmentation for concatenative inventory selection , 1994, SSW.

[27]  Keiichi Tokuda,et al.  An adaptive algorithm for mel-cepstral analysis of speech , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[28]  Piero Pierucci,et al.  Phonetic ergodic HMM for speech synthesis , 1991, EUROSPEECH.

[29]  Takao Kobayashi,et al.  Complex Chebyshev approximation for IIR digital filters using an iterative WLS technique , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[30]  Massimo Giustiniani,et al.  A hidden Markov model approach to speech synthesis , 1989, EUROSPEECH.

[31]  Alan V. Oppenheim,et al.  Discrete-Time Signal Processing , 1989 .

[32]  Frank Fallside,et al.  Lexical stress recognition using hidden Markov models , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[33]  Masafumi Nishimura,et al.  HMM-Based speech recognition using multi-dimensional multi-labeling , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[34]  Stephen E. Levinson,et al.  Continuously variable duration hidden Markov models for speech analysis , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[35]  Saito,et al.  Fundamentals of Speech Signal Processing , 1986 .

[36]  B.-H. Juang,et al.  Maximum-likelihood estimation for mixture multivariate stochastic observations of Markov chains , 1985, AT&T Technical Journal.

[37]  Chikio Hayashi Recent theoretical and methodological developments in multidimensional scaling and its related methods in Japan , 1985 .

[38]  Satoshi Imai,et al.  Cepstral analysis synthesis on the mel frequency scale , 1983, ICASSP.

[39]  Louis A. Liporace,et al.  Maximum likelihood estimation for multivariate observations of Markov sources , 1982, IEEE Trans. Inf. Theory.

[40]  J. Flanagan Speech Analysis, Synthesis and Perception , 1971 .