Prosodic factors for predicting local pitch shape

In this paper, we investigate how well different prosodic factorization schemes predict pitch movement. We use the results to propose extending a standard diphone database with diphones recorded in different prosodic contexts. The goal of this research is to reduce the amount of pitch modification required, thereby improving the segmental quality of the synthetic voice. We present a factorization scheme based on the foot structure of utterances and show that this scheme is efficient: it requires only a fairly small number of additional diphones to be recorded.