A SENTENCE-PITCH-CONTOUR GENERATION METHOD USING VQ/HMM FOR MANDARIN TEXT-TO-SPEECH

In this paper, a method that takes sentence-wide optimization into account is proposed for generating the pitch-contour of a Mandarin sentence. The developed model is called the sentence pitch-contour HMM (SPC-HMM) because it uses VQ (vector quantization) and an HMM (hidden Markov model). To construct an SPC-HMM, the pitch-contours of the syllables from each training sentence are first normalized in both time and pitch height. The pitch-height normalization method is newly developed here and is found to be effective. After normalization, the pitch-contour of each training syllable is vector quantized. Then, the quantization codes and the lexical tones of adjacent syllables are combined to define the observation symbol sequences for HMM training. In the synthesis phase, given a sentence and its relevant text-analysis information, the most probable observation sequence is generated by finding the path with the largest sentence-wide probability using a dynamic-programming based algorithm. We have conducted practical perception tests. It is found that the speech synthesized with the sentence pitch-contour generated by our method is slightly better than that uttered by an ordinary speaker. In addition, the comprehensibility of the synthesized speech is improved.
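As an illustration of the sentence-wide search described above, the following is a minimal sketch of a dynamic-programming (Viterbi-style) search over VQ code indices, one per syllable. It is not the SPC-HMM formulation itself: the function name `best_code_path` and the score tables `init_logp` and `trans_logp` are hypothetical, and the sketch assumes that the lexical-tone context of adjacent syllables has already been folded into the log scores.

```python
from typing import List, Sequence


def best_code_path(
    init_logp: Sequence[float],
    trans_logp: Sequence[Sequence[Sequence[float]]],
) -> List[int]:
    """Return the VQ code sequence with the largest total log score.

    Assumptions (hypothetical, not taken from the paper):
      init_logp[k]        log score of code k for the first syllable
      trans_logp[t][j][k] log score of code k for syllable t+1 given
                          code j for syllable t
    """
    num_codes = len(init_logp)
    score = list(init_logp)          # best score of any path ending in each code
    backptr: List[List[int]] = []    # best predecessor code for each step

    for step in trans_logp:
        new_score = []
        pointers = []
        for k in range(num_codes):
            # Pick the predecessor code that maximizes the path score into k.
            best_j = max(range(num_codes), key=lambda j: score[j] + step[j][k])
            new_score.append(score[best_j] + step[best_j][k])
            pointers.append(best_j)
        backptr.append(pointers)
        score = new_score

    # Trace back from the best final code to recover the whole path.
    last = max(range(num_codes), key=lambda k: score[k])
    path = [last]
    for pointers in reversed(backptr):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Toy example: three syllables, a codebook of two codes, made-up log scores.
toy_init = [0.0, -1.0]
toy_trans = [
    [[-5.0, -1.0], [-0.5, -3.0]],
    [[-1.0, -4.0], [-2.0, -0.1]],
]
toy_path = best_code_path(toy_init, toy_trans)
```

In this sketch each returned index would select one codeword (a normalized syllable pitch-contour) from the VQ codebook. The search cost grows linearly with the number of syllables and quadratically with the codebook size, which is why a dynamic-programming search is practical where enumerating all code sequences is not.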