Cross-language voice conversion

First, the portion of the spectral difference attributable to language is assessed using speech data from a bilingual speaker. The interlanguage difference (between English and Japanese) is found to be smaller than the interspeaker difference, and listening tests indicate that the difference between English and Japanese is very small. Second, a model for cross-language voice conversion is described. In this approach, voice conversion is treated as a mapping problem between two speakers' spectrum spaces, each of which is represented by a codebook. From this point of view, a cross-language voice conversion model and measures for the model are proposed. Speech converted from male to female is as understandable as the unconverted speech and, moreover, is recognized as female speech.
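
The codebook view of conversion can be illustrated with a minimal sketch. It is not the model proposed here: it assumes a hard one-to-one index mapping between source and target codewords, toy two-dimensional spectral vectors, and squared Euclidean distance. The names `nearest_index`, `convert`, and `mapping` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two spectral vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(vector: Vector, codebook: Sequence[Vector]) -> int:
    """Vector-quantize: return the index of the closest codeword."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def convert(frames: Sequence[Vector],
            source_codebook: Sequence[Vector],
            target_codebook: Sequence[Vector],
            mapping: Sequence[int]) -> List[List[float]]:
    """Replace each source frame by the target codeword it maps to.

    Assumption: mapping[i] is the target codeword index paired with
    source codeword i (a hard one-to-one mapping, for illustration only).
    """
    converted = []
    for frame in frames:
        src = nearest_index(frame, source_codebook)
        converted.append(list(target_codebook[mapping[src]]))
    return converted


# Toy data (assumed values, not taken from any experiment).
source_codebook = [[0.0, 0.0], [1.0, 1.0]]
target_codebook = [[10.0, 10.0], [20.0, 20.0]]
mapping = [0, 1]
frames = [[0.1, -0.1], [0.9, 1.2]]

converted = convert(frames, source_codebook, target_codebook, mapping)
```

In this sketch each input frame is quantized against the source speaker's codebook and then replaced by the paired codeword from the target speaker's codebook. A practical system would learn the correspondence from data and would likely use a softer mapping than a single index per codeword.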
