Vowel quality in spontaneous speech: what makes a good vowel?

Clear speech is characterised by longer segmental durations and less target undershoot [9], which results in more extreme spectral features. This paper deals with the clarity of vowels produced in spontaneous speech in a large corpus of task-oriented dialogues. We present an automatic technique for measuring vowel clarity on the basis of a vowel’s spectral characteristics. This technique was evaluated using a perceptual test. Subjects rated the ‘goodness’ of vowels that differed in spectral characteristics but were controlled for duration and amplitude, and these ratings were compared with the automatic rating. Results indicated that although agreement between subjects and the automatic measurement was poor, it was only as poor as the agreement among the subjects themselves. On the basis of these results, we address the following questions: (1) Can subjects reliably judge the clarity of vowels excerpted from spontaneous speech without duration cues? (2) Can a statistical model [3] reliably predict the subjects’ responses to such vowels?