A multispeaker analysis of durations in read French paragraphs.

Understanding how the durations of acoustic segments vary in natural language can lead to more intelligible synthetic speech, and to improved automatic recognition. Toward this goal, a 111-word French paragraph was read by 29 native speakers from France. Measured durations of acoustic segments were significantly shorter than those in earlier studies of stressed words in French sentences read from a list. Previously recognized trends (short schwa vowels and function words; long unvoiced fricatives, nasalized vowels, and prepausal syllables) are confirmed and quantitative results are given. Vowels were longer before voiced fricatives (but not before /r/), and were also longer at sentence-internal pauses than at the end of a sentence. Standard deviations of acoustic segment durations (at fixed positions in the paragraph) across speakers averaged less than 25% in most cases. The exceptional, larger deviations occurred primarily in segments adjacent to pauses. Speaking-rate variations could account for only one-sixth of the deviations, the rest being attributable to relatively free variation across speakers. A generative model of French durations, suitable for synthesis-by-rule, is presented, and applications to automatic recognition are discussed.
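The cross-speaker variability figure quoted above can be illustrated with a minimal sketch. It assumes that the percentages are standard deviations expressed relative to the mean duration at each fixed position in the paragraph, which the abstract implies but does not state outright. The function name `relative_sd`, the position labels, and all duration values are hypothetical and chosen for illustration only; they are not data from the study.

```python
from statistics import mean, stdev


def relative_sd(durations_ms):
    """Return the sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(durations_ms) / mean(durations_ms)


# Hypothetical durations (ms) of the segment at each paragraph position,
# one value per speaker. Illustrative numbers, not measurements.
durations_by_position = {
    "pos_1": [80.0, 100.0, 120.0],
    "pos_2": [60.0, 60.0, 60.0],
}

# Relative standard deviation across speakers, computed per position.
relative_sds = {
    pos: relative_sd(values) for pos, values in durations_by_position.items()
}

# Average of the per-position figures, the kind of summary the abstract reports.
average_relative_sd = mean(list(relative_sds.values()))

print(relative_sds)
print(average_relative_sd)
```

Computing the statistic per position before averaging keeps differences between segment types (for example, a short schwa versus a long prepausal vowel) from being counted as speaker variability.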