Determining the relevance of different aspects of formant contours to intelligibility

Previous studies have shown that "clear" speech, in which the speaker deliberately enunciates, is more intelligible than "conversational" speech, which is produced in ordinary conversation. However, conversational and clear speech differ along a number of acoustic dimensions, and it is unclear which aspects of clear speech lead to better intelligibility. Kain et al. [J. Acoust. Soc. Am. 124 (4), 2308-2319 (2008)] showed that a combination of short-term spectra and duration was responsible for the improved intelligibility of one speaker. The present study investigates subsets of specific features of short-term spectra, including their temporal aspects. As in the study by Kain et al., hybrid stimuli were synthesized by combining selected features from clear speech with the complementary features from conversational speech, in order to determine which acoustic features cause the improved intelligibility of clear speech. Our results indicate that, although steady-state formant values of tense vowels contributed to the intelligibility of clear speech, neither the steady-state portion nor the formant transition was sufficient to yield intelligibility comparable to that of clear speech. In contrast, when the entire formant contour of conversational speech, including the phoneme duration, was replaced by that of clear speech, intelligibility was comparable to that of clear speech. This indicates that the combination of formant contour and duration information is relevant to the improved intelligibility of clear speech. The study thus provides a better understanding of how different aspects of formant contours contribute to the improved intelligibility of clear speech.
