A novel hybrid approach for Mandarin speech synthesis

This paper investigates a new method for solving concatenation problems in Mandarin speech synthesis, based on a hybrid of HMM-based speech synthesis and unit selection. Unlike other work, which uses only boundary F0 errors as the concatenation cost, we train a CART-based F0 dependency model that takes a wide range of context information into account to measure F0 smoothness. Instead of phoneme-sized units, the basic units of our HUS system are syllables, which have been shown to give better prosodic stability in Mandarin. Experiments show that the proposed method outperforms both a conventional hybrid system and a unit selection system.

Index Terms: speech synthesis, hidden Markov model, unit selection, hybrid
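
To make the contrast between a boundary-only F0 cost and a context-dependent F0 cost concrete, the following is a minimal sketch and not the system described in the paper. The `Syllable` type, the function names, and the hand-written lookup standing in for a trained CART are all hypothetical, and the numeric F0 jumps are illustrative placeholders rather than measured values.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    """Hypothetical candidate unit: a syllable with its tone and edge F0 values."""
    tone: int        # Mandarin tone label (1-5)
    f0_start: float  # F0 at the syllable onset, in Hz
    f0_end: float    # F0 at the syllable offset, in Hz


def boundary_f0_cost(left: Syllable, right: Syllable) -> float:
    """Boundary-only cost: absolute F0 mismatch at the join."""
    return abs(left.f0_end - right.f0_start)


def expected_f0_jump(left_tone: int, right_tone: int) -> float:
    """Stand-in for a trained CART: predicts the F0 jump across the join
    from context (here only the two tones). Values are made up for illustration."""
    if left_tone == 4 and right_tone == 1:
        return 40.0   # assumed: a falling tone followed by a high level tone jumps up
    if left_tone == 1 and right_tone == 3:
        return -30.0  # assumed: a high level tone followed by a low tone drops
    return 0.0        # assumed default: no jump expected


def dependency_f0_cost(left: Syllable, right: Syllable) -> float:
    """Context-dependent cost: deviation of the observed jump from the predicted one."""
    observed = right.f0_start - left.f0_end
    return abs(observed - expected_f0_jump(left.tone, right.tone))


left = Syllable(tone=4, f0_start=220.0, f0_end=160.0)
right = Syllable(tone=1, f0_start=200.0, f0_end=205.0)
print(boundary_f0_cost(left, right))    # penalizes the 40 Hz jump
print(dependency_f0_cost(left, right))  # the jump matches the context, so no penalty
```

In this toy case the boundary-only cost penalizes a join whose F0 jump is exactly what the context predicts, while the context-dependent cost does not. A trained regression tree over richer context features would replace `expected_f0_jump` in practice.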