Study on Length Distribution of Prosodic Phrase and its Function in Prosodic Structure Prediction

Prosodic phrase prediction is an essential module of a speech synthesis system, and its accuracy directly influences the naturalness of the synthetic speech. Most current prosodic phrase prediction methods rely on lexical or shallow syntactic information. However, the segmentation of prosodic phrases in natural speech is constrained not only by grammatical structure but also by the distribution of phrase lengths. In this paper, a statistical analysis of prosodic phrase length distribution is conducted with the syllable and the foot as units, and three prosodic phrase length models are proposed on the basis of the maximum entropy model: a rule-based recursive model, a probability-based local optimum model, and a probability-based global optimum model. Experimental results show that the length models can significantly improve the accuracy of prosodic phrase prediction. They also indicate that natural speech tends to alternate long and short intervals between pauses, and that prosodic phrase planning is a kind of short-time, local planning.