FlexVoice: A Parametric Approach to High-Quality Speech Synthesis

This paper presents FlexVoice, an integrated text-to-speech (TTS) system. Its most distinctive feature is its low memory and CPU load, achieved while preserving the high quality of leading TTS systems. FlexVoice uses a hybrid approach that combines diphone concatenation with LPC-based parametric synthesis. Major improvements in speech quality are achieved through the careful design of each module at all synthesis levels, such as the selection of training data for the various machine learning methods and of the basic synthesis units for the parametric synthesiser. FlexVoice currently supports US English with two male and two female voices.
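
To illustrate what LPC-based parametric synthesis means in general, the following minimal sketch passes an excitation signal through an all-pole filter defined by predictor coefficients. It is not FlexVoice's implementation: the function names `lpc_synthesize` and `impulse_train`, the sign convention for the coefficients, and the toy values are all assumptions chosen for illustration.

```python
def lpc_synthesize(excitation, predictor, gain=1.0):
    """Run an excitation through an all-pole (LPC) synthesis filter.

    Assumed convention (hypothetical, for illustration only):
        s[n] = gain * e[n] + sum_k predictor[k] * s[n - 1 - k]
    """
    out = []
    for n, e in enumerate(excitation):
        sample = gain * e
        # Add the contribution of previously synthesised samples.
        for k, a in enumerate(predictor):
            if n - 1 - k >= 0:
                sample += a * out[n - 1 - k]
        out.append(sample)
    return out


def impulse_train(length, period):
    """Crude voiced excitation: a unit impulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


# Toy example: a first-order filter driven by an impulse every 4 samples.
frame = lpc_synthesize(impulse_train(8, 4), predictor=[0.5])
```

In a parametric synthesiser of this general kind, each stored unit would carry a set of filter coefficients per frame rather than raw waveform samples, which is one common reason such systems have a small memory footprint.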
