Automatic extraction of model parameters from fundamental frequency contours of English utterances

The generation process model of fundamental frequency contours (F0 contours) of speech is known to generate F0 contours that closely match observed ones. Extracting the model parameters from an observed contour, however, requires an iterative process that starts from a set of initial parameter values. To ensure rapid convergence to an optimum solution, these initial values must be appropriate. We have already developed a method for extracting them automatically from a given F0 contour and applied it to Japanese sentences with good results. The method is based on approximating the observed contour by a continuous curve that is differentiable everywhere. In the present paper, the method is applied to English utterances. In experiments on utterances by four native speakers, the average miss and false alarm rates were 14.5% and 17.5% for the accent commands, and 35.7% and 15.5% for the phrase commands.
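
For orientation, the parameters being extracted can be illustrated with a minimal sketch of the model in its commonly cited command-response form, where the logarithm of F0 is a baseline plus responses to phrase commands (impulses) and accent commands (pedestals). The abstract gives no equations, so the function names and the constants below (alpha = 3.0, beta = 20.0, gamma = 0.9, and the example commands) are illustrative assumptions, not values taken from this paper.

```python
import math

def phrase_response(t, alpha=3.0):
    """Impulse response of the phrase control mechanism (zero for t < 0)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0

def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def f0_contour(times, fb, phrase_cmds, accent_cmds,
               alpha=3.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at each time in `times` (seconds).

    fb:          baseline frequency in Hz
    phrase_cmds: list of (onset_time, magnitude)
    accent_cmds: list of (onset_time, offset_time, amplitude)
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        # Each phrase command contributes a scaled impulse response.
        for t0, ap in phrase_cmds:
            log_f0 += ap * phrase_response(t - t0, alpha)
        # Each accent command contributes the difference of two step responses.
        for t1, t2, aa in accent_cmds:
            log_f0 += aa * (accent_response(t - t1, beta, gamma)
                            - accent_response(t - t2, beta, gamma))
        contour.append(math.exp(log_f0))
    return contour

# Hypothetical example: one phrase command and one accent command.
times = [i * 0.01 for i in range(200)]
f0 = f0_contour(times, fb=100.0,
                phrase_cmds=[(0.0, 0.5)],
                accent_cmds=[(0.3, 0.6, 0.4)])
```

In this sketch the quantities to be estimated from an observed contour would be the baseline `fb`, the timing and magnitude of each phrase command, and the onset, offset, and amplitude of each accent command; the iterative fitting and the initial-value estimation described in the abstract are not shown.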