Automatic detection of prosodic boundaries in speech

Abstract. This paper describes a method for the automatic annotation of prosodic events in speech using segmental duration information. It details a way of differentiating prominence-related lengthening from boundary-related lengthening using durational cues alone, and discusses an anomaly in the phrasing characteristics of four speakers' readings of 200 phonetically balanced sentences. An algorithm is described that uses syllable-level differences in normalised segmental duration measures to detect prosodic boundaries in a speech signal. Tests with read-speech data from four British-English RP speakers show high agreement between speakers on the number of boundaries detected and on the length of the phrases delimited by each pair of boundaries, but the correlation between speakers on the actual boundary locations is low. Disagreement is particularly marked where a single function word links two groups of content words. This discrepancy can be resolved if the boundary is taken to lie at the function word itself, rather than on one side of it or the other. These results are taken to indicate some freedom in the placement of prosodic boundaries in such cases, with placement cued sometimes by a syntactic boundary and sometimes by a rhythmic one.
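
The general idea of detecting boundaries from syllable-level differences in normalised segmental duration can be illustrated with a minimal sketch. This is not the algorithm described in the paper. It assumes that each segment's duration is z-score normalised against per-phone statistics, that a syllable's score is the mean of its segment scores, and that a boundary is flagged after a syllable whose score exceeds that of the following syllable by a fixed margin. The phone statistics, the threshold, the function names and the example durations are all hypothetical.

```python
from statistics import mean

# Hypothetical per-phone duration statistics (mean, standard deviation) in ms.
# Real values would be estimated from a labelled speech corpus.
PHONE_STATS = {
    "a": (90.0, 30.0),
    "t": (60.0, 20.0),
    "n": (55.0, 15.0),
    "s": (100.0, 25.0),
}


def z_score(phone, duration_ms):
    """Normalise one segment duration against its phone's statistics."""
    mu, sd = PHONE_STATS[phone]
    return (duration_ms - mu) / sd


def syllable_scores(syllables):
    """Return the mean normalised segment duration for each syllable."""
    return [mean(z_score(p, d) for p, d in syl) for syl in syllables]


def detect_boundaries(syllables, drop_threshold=1.0):
    """Return indices of syllables after which a boundary is hypothesised.

    Assumption of this sketch: a boundary follows syllable i when its score
    is at least drop_threshold higher than the score of syllable i + 1.
    The final syllable is never flagged because it has no successor.
    """
    scores = syllable_scores(syllables)
    return [
        i
        for i in range(len(scores) - 1)
        if scores[i] - scores[i + 1] >= drop_threshold
    ]


# Each syllable is a list of (phone, duration in ms) pairs; values are invented.
example = [
    [("t", 60.0), ("a", 90.0)],
    [("n", 55.0), ("a", 150.0)],  # lengthened vowel
    [("s", 100.0), ("a", 90.0)],
    [("t", 60.0), ("a", 90.0), ("n", 85.0)],
]
boundaries = detect_boundaries(example)
```

With these invented durations the second syllable is lengthened relative to the third, so the sketch hypothesises a boundary after it. A fixed-threshold rule of this kind does not by itself separate prominence-related from boundary-related lengthening, which is the distinction the paper addresses.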