Learning to recognize talkers from natural, sinewave, and reversed speech samples.

In 5 experiments, the authors investigated how listeners learn to recognize unfamiliar talkers and how experience with specific utterances generalizes to novel instances. Listeners were trained over several days to identify 10 talkers from natural, sinewave, or reversed speech sentences. The sinewave signals preserved phonetic and some suprasegmental properties while eliminating natural vocal quality. In contrast, the reversed speech signals preserved vocal quality while distorting temporally based phonetic properties. The training results indicate that listeners learned to identify talkers even from acoustic signals lacking natural vocal quality. Generalization performance varied across the different signals and depended on the salience of phonetic information. The results suggest similarities in the phonetic attributes underlying talker recognition and phonetic perception.
