A review of research on speech intelligibility and correlations with acoustic features

This review article surveys the differences between conversational (cnv) and clear (clr) speech across a variety of speakers, in terms of both speech intelligibility and acoustic characteristics. Researchers have studied the relationship between acoustic features and speech intelligibility, for example by examining correlations between the two. However, the question "Which acoustic features of clr speech cause it to be more intelligible?" remains unanswered. To approach this question, it is valuable to summarize past findings on speech intelligibility and its relationship to acoustic features, without limiting the review to clr speech materials. The outcome of this review can then be used to restrict the search space when looking for acoustic features that contribute to increased speech intelligibility. Finally, we review computer-processing techniques for improving speech intelligibility, summarize these findings, and propose future modifications.

Keywords: speech intelligibility, clear speech, speech modification, elderly listeners
