Advanced LPC techniques of voice regeneration for “Virtual Dubbing”

Some recent voice conversion techniques build on well-known signal processing paradigms such as Linear Predictive Coding (LPC) and spectral modelling. We propose a voice converter based on LPC, in which suitably trained Gaussian Mixture Models transform the encoder coefficients, which account for the glottal characteristics of a source voice, into new coefficients that provide the decoder with information about the glottal characteristics of a target voice. The conversion procedure results in a filter block diagram suitable for real-time implementation, whose parameters can be adjusted to the performance of the available DSP hardware. We present a Simulink model of the voice converter that can be translated directly into DSP code. We also report listening experiments in which both non-expert and expert subjects rated the voice converter positively.
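
The two building blocks named above, LPC analysis and a Gaussian-mixture mapping of the resulting coefficients, can be illustrated with a minimal Python sketch. It is not the Simulink model described here: the function names `lpc` and `gmm_convert`, the autocorrelation/Levinson-Durbin estimator, and the joint-density form of the mapping (posterior-weighted linear regressions per mixture component) are assumptions chosen for illustration, and the mixture parameters are taken as already trained. The abstract does not say which coefficient representation is fed to the mixture model, so the sketch leaves the feature vector generic.

```python
import numpy as np


def lpc(frame, order):
    """Estimate LPC coefficients with the autocorrelation method.

    Returns (a, err) where a[0] == 1 and A(z) = sum_k a[k] z^-k is the
    prediction-error filter; err is the residual energy.
    """
    n = len(frame)
    # Autocorrelation lags 0..order of the analysis frame.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    # Levinson-Durbin recursion.
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def gmm_convert(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Map a source feature vector x to a target vector (assumed form).

    Each mixture component m contributes the linear regression
    mu_y[m] + cov_yx[m] @ inv(cov_xx[m]) @ (x - mu_x[m]),
    weighted by the posterior probability of m given x.
    """
    n_mix = len(weights)
    log_post = np.empty(n_mix)
    for m in range(n_mix):
        d = x - mu_x[m]
        _, logdet = np.linalg.slogdet(cov_xx[m])
        maha = d @ np.linalg.solve(cov_xx[m], d)
        # The (2*pi)^(-dim/2) factor is common to all components and cancels.
        log_post[m] = np.log(weights[m]) - 0.5 * (logdet + maha)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    y = np.zeros(len(mu_y[0]))
    for m in range(n_mix):
        d = x - mu_x[m]
        y += post[m] * (mu_y[m] + cov_yx[m] @ np.linalg.solve(cov_xx[m], d))
    return y
```

In a frame-by-frame converter, `lpc` would run on each windowed source frame, the coefficients (or a representation derived from them) would pass through `gmm_convert`, and the result would drive the synthesis filter. Mapping raw predictor coefficients does not guarantee a stable synthesis filter, so a practical system would need to check or constrain the converted coefficients.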
