Talker-specific learning in speech perception

The effects of perceptual learning of talker identity on the recognition of spoken words and sentences were investigated in three experiments. In each experiment, listeners were trained to recognize the voices of 10 talkers and were then given an intelligibility test to assess the influence of learning the voices on the processing of the linguistic content of speech. In the first experiment, listeners learned voices from isolated words and were then tested with novel isolated words mixed in noise. Listeners who were given words produced by familiar talkers at test identified them better than did listeners who were given words produced by unfamiliar talkers. In the second experiment, listeners learned novel voices from sentence-length utterances and were then presented with isolated words. Learning a talker’s voice from sentences did not generalize well to the identification of novel isolated words. In the third experiment, listeners learned voices from sentence-length utterances and were then given sentence-length utterances produced by familiar and unfamiliar talkers at test. Perceptual learning of novel voices from sentence-length utterances improved speech intelligibility for words in sentences. Generalization and transfer from voice learning to linguistic processing were found to be sensitive to the talker-specific information available during learning and test. These findings demonstrate that increased sensitivity to talker-specific information affects the perception of the linguistic properties of speech in isolated words and sentences.

[1]  J. P. Egan Articulation testing methods , 1948, The Laryngoscope.

[2]  D. Lawrence Acquired distinctiveness of cues; transfer between discrimination on the basis of familiarity with the stimulus. , 1949, Journal of experimental psychology.

[3]  G. E. Peterson,et al.  Control Methods Used in a Study of the Vowels , 1951 .

[4]  Irwin Pollack,et al.  On the Identification of Speakers by Voice , 1954 .

[5]  J. Gibson,et al.  Perceptual learning; differentiation or enrichment? , 1955, Psychological review.

[6]  C. Douglas Creelman,et al.  Case of the Unknown Talker , 1957 .

[7]  D. Broadbent,et al.  Information Conveyed by Vowels , 1957 .

[8]  J. Wohlwill The definition and analysis of perceptual learning. , 1958, Psychological review.

[9]  R. Shepard,et al.  Retention of information under conditions approaching a steady state. , 1961, Journal of experimental psychology.

[10]  P. Ladefoged,et al.  Speaker Identification and Message Identification in Speech Recognition , 1963 .

[11]  K. D. Kryter,et al.  ARTICULATION-TESTING METHODS: CONSONANTAL DIFFERENTIATION WITH A CLOSED-RESPONSE SET. , 1965, The Journal of the Acoustical Society of America.

[12]  Carl E. Williams,et al.  The Effects of Selected Factors on The Aural Identification of Speakers , 1965 .

[13]  D. Abercrombie,et al.  Elements of General Phonetics , 1967 .

[14]  E. Gibson Principles of Perceptual Learning and Development , 1969 .

[15]  P. Costanzo,et al.  Voice Quality Profile and Perceived Emotion. , 1969 .

[16]  IEEE Recommended Practice for Speech Quality Measurements , 1969, IEEE Transactions on Audio and Electroacoustics.

[17]  N. Markel,et al.  The Relationship Between Words and Tone-of-Voice , 1973, Language and speech.

[18]  W. R. Garner The Processing of Information and Structure , 1974 .

[19]  M Coltheart,et al.  Memory of a Speaker's Voice: Reaction Time to Same- or Different-voiced Letters , 1974, The Quarterly journal of experimental psychology.

[20]  F. Craik,et al.  The Effect of Speaker's Voice on Word Recognition , 1974 .

[21]  David J. Ostry,et al.  Time course of loss of information regarding pattern analyzing operations , 1974 .

[22]  M. P. Haggard,et al.  Vocal Tract Normalisation as Demonstrated by Reaction Times , 1975 .

[23]  P. Mermelstein,et al.  Speech sounds and features , 1975, Proceedings of the IEEE.

[24]  F. Bellezza,et al.  Long-term memory for speaker’s voice and source location , 1976, Memory & cognition.

[25]  P A Kolers,et al.  Pattern-analyzing memory. , 1976, Science.

[26]  D. Shankweiler,et al.  What information enables a listener to map a talker's vowel space? , 1976, The Journal of the Acoustical Society of America.

[27]  Peter D. Bricker,et al.  chapter 9 – Speaker Recognition , 1976 .

[28]  N. Lass Contemporary Issues in Experimental Phonetics , 1976 .

[29]  R E Geiselman,et al.  Incidental retention of speaker’s voice , 1977, Memory & cognition.

[30]  R. Shaw,et al.  Perceiving, Acting and Knowing : Toward an Ecological Psychology , 1978 .

[31]  S. Blumstein,et al.  Invariant cues for place of articulation in stop consonants. , 1978, The Journal of the Acoustical Society of America.

[32]  E. Rosch,et al.  Cognition and Categorization , 1980 .

[33]  R E Geiselman Inhibition of the automatic storage of speaker’s voice , 1979, Memory & cognition.

[34]  P. Trudgill,et al.  Phonetic and linguistic markers in speech , 1979 .

[35]  P. Ladefoged WHAT ARE LINGUISTIC SOUNDS MADE OF , 1980 .

[36]  J. T. Hogan,et al.  Vowel identification: orthographic, perceptual, and acoustic aspects. , 1982, The Journal of the Acoustical Society of America.

[37]  Ralph E. Geiselman,et al.  Incidental processing of speaker characteristics: voice as connotative information , 1983 .

[38]  J. Fodor The Modularity of mind. An essay on faculty psychology , 1986 .

[39]  L. Brooks,et al.  Nonanalytic Cognition: Memory, Perception, and Concept Learning , 1984 .

[40]  W. Schneider,et al.  Automatic category search and its transfer. , 1984, Journal of experimental psychology. Learning, memory, and cognition.

[41]  Gordon E. Legge,et al.  Learning unfamiliar voices , 1984 .

[42]  D. Dennett Carving the Mind at its Joints. , 1984 .

[43]  W. Strange,et al.  Effects of discrimination training on the perception of /r-l/ by Japanese adults learning English , 1984, Perception & psychophysics.

[44]  H. Nusbaum Sizing up the Hoosier Mental Lexicon: Measuring the Familiarity of 20,000 Words, Research on Speech Perception , 1984 .

[45]  G.R. Doddington,et al.  Speaker recognition—Identifying people by their voices , 1985, Proceedings of the IEEE.

[46]  D. Lancker,et al.  Familiar voice recognition: Patterns and parameters. Part I. Recognition of backward voices , 1985 .

[47]  A. Liberman,et al.  The motor theory of speech perception revised , 1985, Cognition.

[48]  E C Schwab,et al.  Some Effects of Training on the Perception of Synthetic Speech , 1985, Human factors.

[49]  D. Lancker,et al.  Familiar voice recognition: Parameters and patterns. Part II. Recognition of rate-altered voices , 1985 .

[50]  D. Pisoni Speech perception: some new directions in research and theory. , 1985, The Journal of the Acoustical Society of America.

[51]  James L. McClelland,et al.  The TRACE model of speech perception , 1986, Cognitive Psychology.

[52]  Douglas L. Hintzman,et al.  "Schema Abstraction" in a Multiple-Trace Memory Model , 1986 .

[53]  D. O'Shaughnessy,et al.  Speaker recognition , 1986, IEEE ASSP Magazine.

[54]  C. Fowler An event approach to the study of speech perception from a direct realist perspective , 1986 .

[55]  James D. Miller Auditory‐perceptual interpretation of the vowel , 1987 .

[56]  D. Lancker,et al.  Voice discrimination and recognition are separate abilities , 1987, Neuropsychologia.

[57]  Jay L. Garfield,et al.  Modularity in Knowledge Representation and Natural-Language Understanding , 1987 .

[58]  R. Nosofsky Attention and learning processes in the identification and categorization of integral stimuli. , 1987, Journal of experimental psychology. Learning, memory, and cognition.

[59]  P. Change Some Thoughts on , 1988 .

[60]  S. Greenspan,et al.  Perceptual learning of synthetic speech produced by rule. , 1988, Journal of experimental psychology. Learning, memory, and cognition.

[61]  B. Dobkin,et al.  Phonagnosia: A Dissociation Between Familiar and Unfamiliar Voices , 1988, Cortex.

[62]  J. D. Miller,et al.  Auditory-perceptual interpretation of the vowel. , 1989, The Journal of the Acoustical Society of America.

[63]  J. Mullennix,et al.  Effects of talker variability on recall of spoken word lists. , 1989, Journal of experimental psychology. Learning, memory, and cognition.

[64]  T. M. Nearey Static, dynamic, and relational properties in vowel perception. , 1989, The Journal of the Acoustical Society of America.

[65]  J. Mullennix,et al.  Some effects of talker variability on spoken word recognition. , 1989, The Journal of the Acoustical Society of America.

[66]  D B Pisoni,et al.  Stimulus variability and processing dependencies in speech perception , 1990, Perception & psychophysics.

[67]  D. Schacter Perceptual Representation Systems and Implicit Memory , 1990, Annals of the New York Academy of Sciences.

[68]  Keith Johnson,et al.  The role of perceived speaker identity in F0 normalization of vowels. , 1990, The Journal of the Acoustical Society of America.

[69]  P. Kuhl Human adults and human infants show a “perceptual magnet effect” for the prototypes of speech categories, monkeys do not , 1991, Perception & psychophysics.

[70]  D. Lancker,et al.  Personal relevance and the human right hemisphere , 1991, Brain and Cognition.

[71]  David B. Pisoni,et al.  On the nature of talker variability effects on recall of spoken word lists. , 1991 .

[72]  D. Pisoni,et al.  Training Japanese listeners to identify English /r/ and /l/: a first report. , 1991, The Journal of the Acoustical Society of America.

[73]  Geoffrey Hall Perceptual and associative learning , 1991 .

[74]  David B. Pisoni,et al.  Similarity neighborhoods of spoken words , 1991 .

[75]  S. Goldinger,et al.  On the nature of talker variability effects on recall of spoken word lists. , 1991, Journal of experimental psychology. Learning, memory, and cognition.

[76]  Edward S. Reed,et al.  An Odyssey in Learning and Perception. , 1992 .

[77]  Patricia K. Kuhl,et al.  Psychoacoustics and speech perception: Internal standards, perceptual anchors, and prototypes. , 1992 .

[78]  David B. Pisoni,et al.  Long-term memory in speech perception: Some new findings on talker variability, speaking rate and perceptual learning , 1993, Speech Commun..

[79]  D. Pisoni,et al.  Training Japanese listeners to identify English /r/ and /l/. II: The role of phonetic environment and talker variability in learning new perceptual categories. , 1993, The Journal of the Acoustical Society of America.

[80]  Iain R. Murray,et al.  Toward the simulation of emotion in synthetic speech: a review of the literature on human vocal emotion. , 1993, The Journal of the Acoustical Society of America.

[81]  S. Goldinger,et al.  Episodic encoding of voice attributes and recognition memory for spoken words. , 1993, Journal of experimental psychology. Learning, memory, and cognition.

[82]  D. Pisoni,et al.  SPEECH PERCEPTION AS A TALKER-CONTINGENT PROCESS. , 1993, Psychological science.

[83]  Modeling the effect of learning voices on the perception of speech , 1994 .

[84]  D. Schacter,et al.  Perceptual specificity of auditory priming: implicit memory for voice intonation and fundamental frequency. , 1994, Journal of experimental psychology. Learning, memory, and cognition.

[85]  D. Pisoni,et al.  Training Japanese listeners to identify English /r/ and /l/. III. Long-term retention of new phonetic categories. , 1994, The Journal of the Acoustical Society of America.

[86]  Robert L. Goldstone Influences of categorization on perceptual discrimination. , 1994, Journal of experimental psychology. General.

[87]  D B Pisoni,et al.  Stimulus variability and spoken word recognition. I. Effects of variability in speaking rate and overall amplitude. , 1994, The Journal of the Acoustical Society of America.

[88]  D. Pisoni,et al.  Speech Perception as a Talker-Contingent Process , 1993, Psychological science.

[89]  On the contribution of instance-specific characteristics to speech perception , 1995 .

[90]  D. Pisoni,et al.  Effects of stimulus variability on perception and representation of spoken words in memory , 1995, Perception & psychophysics.

[91]  S. Goldinger Words and voices: episodic traces in spoken word identification and recognition memory. , 1996, Journal of experimental psychology. Learning, memory, and cognition.

[92]  Ma Conway,et al.  Handbook of perception and cognition , 1996 .

[93]  David B. Pisoni,et al.  Intelligibility of normal speech I: Global and fine-grained acoustic-phonetic talker characteristics , 1996, Speech Commun..

[94]  Jennifer M. Fellowes,et al.  Talker identification based on phonetic information. , 1997, Journal of experimental psychology. Human perception and performance.

[95]  Sadaoki Furui,et al.  Speaker recognition , 1997, Scholarpedia.

[96]  Emmanuel Dupoux,et al.  Perceptual adjustment to highly compressed speech: effects of talker and rate changes. , 1997, Journal of experimental psychology. Human perception and performance.