Applying improved spectral modeling for High Quality voice conversion

In this work, accurate spectral envelope estimation is applied to Voice Conversion in order to achieve High-Quality timbre conversion. True-Envelope based estimators allow model order selection leading to an adaptation of the spectral features to the characteristics of the speaker. Optimal residual signals can also be computed following a local adaptation of the model order in terms of the F0. A new perceptual criteria is proposed to measure the impact of the spectral conversion error. The proposed envelope models show improved spectral conversion performance as well as increased converted-speech quality when compared to Linear Prediction.

[1]  Tomoki Toda,et al.  Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[2]  Axel Röbel,et al.  Improving Lpc Spectral Envelope Extraction Of Voiced Speech By True-Envelope Estimation , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[3]  Axel Röbel,et al.  On cepstral and all-pole based spectral envelope modeling with unknown model order , 2007, Pattern Recognit. Lett..

[4]  Eric Moulines,et al.  Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..

[5]  Alexander Kain,et al.  Spectral voice conversion for text-to-speech synthesis , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[6]  Hynek Hermansky,et al.  Spectral envelope sampling and interpolation in linear predictive analysis of speech , 1984, ICASSP.

[7]  Hugo Fastl,et al.  Psychoacoustics Facts and Models. 2nd updated edition , 1999 .

[8]  Amro El-Jaroudi,et al.  Discrete all-pole modeling , 1991, IEEE Trans. Signal Process..

[9]  Axel Röbel,et al.  All-Pole Spectral Envelope Modelling with Order Selection for Harmonic Signals , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[10]  Andries P. Hekstra,et al.  Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).