Vocal Imitations and the Identification of Sound Events

It is commonly observed that speakers vocally imitate a sound they intend to communicate to an interlocutor. We report an experiment that tested two assumptions: that vocal imitations can effectively communicate a referent sound, and that they do so by conveying the features necessary to identify the referent sound event. Participants sorted a set of vocal imitations of everyday sounds. In most cases, the resulting clusters corresponded to the categories of the referent sound events, indicating that the imitations enabled the listeners to recover what was imitated. Furthermore, a binary decision tree analysis showed that a few characteristic acoustic features predicted the clusters. These features also predicted the classification of the referent sounds but did not generalize to the categorization of other sounds. This showed that, for the speaker, vocally imitating a sound consists of conveying the acoustic features important for recognition, within the constraints of human vocal production. As such, vocal imitations prove to be a potentially useful phenomenon for studying sound identification.
