Auditory-Phonetic Projection and Lexical Structure in the Recognition of Sine-Wave Words

Speech remains intelligible despite the elimination of canonical acoustic correlates of phonemes from the spectrum. A portion of this perceptual flexibility can be attributed to modulation sensitivity in the auditory-to-phonetic projection, though signal-independent properties of lexical neighborhoods also affect intelligibility in utterances composed of words. Three tests were conducted to estimate the effects of exposure to natural and sine-wave samples of speech on this kind of perceptual versatility. First, sine-wave versions of the easy and hard word sets were created, modeled on the speech samples of a single talker. The performance difference in recognition of easy and hard words was used to index the perceptual reliance on signal-independent properties of lexical contrasts. Second, three kinds of exposure each produced familiarity with one aspect of sine-wave speech: (1) sine-wave sentences modeled on the same talker; (2) sine-wave sentences modeled on a different talker, to create familiarity with a sine-wave carrier; and (3) natural sentences spoken by the same talker, to create familiarity with the idiolect expressed in the sine-wave words. Recognition of both easy and hard sine-wave words improved only after exposure to sine-wave sentences modeled on the same talker. Third, a control test showed that signal-independent uncertainty is a plausible cause of differences in recognition of easy and hard sine-wave words. The conditions of beneficial exposure reveal the specificity of attention underlying versatility in speech perception.
