Hidden Markov model based voice conversion using dynamic characteristics of speaker

This paper proposes a new voice conversion technique based on the hidden Markov model (HMM) that models a speaker’s dynamic characteristics. The basic idea is to use the HMM state transition probabilities as the speaker’s dynamic characteristics and to attach a conversion rule to each HMM state. Two methods are developed for creating the state-dependent conversion rules: one uses the source speaker’s spectral dynamics and the other uses the target speaker’s. Experimental results show that the proposed methods outperform the conventional vector quantization (VQ) method in both objective and subjective tests. A comparison of the two proposed methods shows that the one using the target speaker’s dynamics is superior in the listening test and produces more natural-sounding speech.
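
The idea of state-dependent conversion can be illustrated with a minimal sketch. The abstract does not specify the form of the conversion rule or the emission model, so everything here is an assumption made for illustration: emissions are isotropic Gaussians, the rule at each state is a simple additive offset (target state mean minus source state mean), and the names `viterbi`, `gaussian_loglik`, and `convert` are hypothetical. The transition matrix `A` stands in for the speaker dynamics; in the terms of the abstract it could be estimated from either the source or the target speaker.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B):
    """Return the most likely state path; log_B[t, s] scores frame t in state s."""
    T, S = log_B.shape
    delta = np.empty((T, S))
    back = np.zeros((T, S), dtype=int)
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        # scores[i, j]: best score ending in state i at t-1, then moving to j
        scores = delta[t - 1][:, None] + log_A
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

def gaussian_loglik(frames, means, var=1.0):
    """Log-likelihood up to a constant (assumed isotropic, shared variance)."""
    diff = frames[:, None, :] - means[None, :, :]
    return -0.5 * (diff ** 2).sum(axis=2) / var

def convert(frames, log_pi, log_A, src_means, offsets):
    """Decode the state sequence, then apply each state's (assumed) offset rule."""
    path = viterbi(log_pi, log_A, gaussian_loglik(frames, src_means))
    return frames + offsets[path], path

# Toy two-state example with made-up numbers (not data from the paper).
src_means = np.array([[0.0, 0.0], [5.0, 5.0]])
tgt_means = np.array([[1.0, 1.0], [7.0, 7.0]])
offsets = tgt_means - src_means            # hypothetical per-state conversion rule
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])     # transition probabilities as "dynamics"
frames = np.array([[0.1, -0.1], [0.2, 0.0], [4.9, 5.1], [5.0, 5.2]])

converted, path = convert(frames, np.log(pi), np.log(A), src_means, offsets)
```

In this toy setup the transition matrix influences which state, and therefore which conversion rule, is selected for each frame, which is the role the abstract assigns to the speaker's dynamic characteristics.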
