Voice Conversion: State-of-the-Art and Future Work

Introduction

Voice conversion is the adaptation of the characteristics of a source speaker’s voice to those of a target speaker. Interest in voice conversion has risen significantly over the last few years, driven by its application to the individualization of text-to-speech systems, whose voices generally have to be created in a time-consuming process that requires human assistance. This paper lists the most popular applications and solution approaches. Some of these applications require text-independent or even cross-language voice conversion. Evaluation methods are then discussed and, finally, the author’s future work is outlined.
