The role of outer hair cell function in the perception of synthetic versus natural speech

Hearing loss, as assessed by pure-tone audiometry (PTA), is significantly correlated with the intelligibility of synthetic speech. However, PTA is a subjective audiological measure that assesses the entire auditory pathway and does not distinguish between the different afferent and efferent contributions. In this paper, we focus on one particular aspect of hearing that has been shown to correlate with hearing loss: outer hair cell (OHC) function. One role of OHCs is to increase auditory sensitivity and frequency selectivity. This function can be assessed quickly and objectively through otoacoustic emissions (OAE) testing, which is little known outside the field of audiology. We find that OHC function affects the perception of natural human speech, but not that of synthetic speech. This finding has important implications not just for audiological and electrophysiological research, but also for adapting speech synthesis to ageing ears.
