Phonetic and lexical interferences in informational masking during speech-in-speech comprehension

This study investigates masking effects that occur during speech comprehension in the presence of concurrent speech signals. We examined the differential effects of the acoustic-phonetic and lexical content of 4- to 8-talker babble (natural speech) or babble-like noise (reversed speech) on word identification. Behavioral results show a monotonic decrease in speech comprehension rates with an increasing number of simultaneous talkers in the reversed condition. Similar results are obtained with natural speech, except in the 4-talker babble condition. An original signal analysis is then proposed to evaluate the spectro-temporal saturation of composite multitalker babble. Results from this analysis show a monotonic increase in spectro-temporal saturation with an increasing number of simultaneous talkers, for both natural and reversed speech. This suggests that informational masking comprises at least two components: acoustic-phonetic masking, which is fairly similar in the reversed and natural conditions, and lexical masking, which is present only with natural babble. Both effects depend on the number of talkers in the background babble. In particular, the results confirm that lexical masking occurs only when some words in the babble are detectable, i.e., with a low number of talkers such as 4, and that it diminishes with more talkers. These results suggest that different levels of linguistic information can be extracted from background babble and cause different types of linguistic competition for target-word identification. This paradigm could be of primary interest to psycholinguists seeking to detail the various types of information that compete during lexical access.
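
The abstract does not say how spectro-temporal saturation is computed, so the following is only a minimal sketch of one possible proxy, not the authors' analysis. It assumes saturation can be approximated as the fraction of time-frequency cells whose power exceeds a floor relative to the spectrogram peak. The talkers are hypothetical synthetic stand-ins (noise bursts separated by pauses) rather than recorded speech, and every function name and parameter value is an assumption made for illustration.

```python
import numpy as np

def synthetic_talker(rng, n_samples, sr=16000, activity=0.2):
    """Hypothetical stand-in for one talker: white noise gated on and off
    in 100 ms segments, so that each talker leaves gaps in time."""
    seg = sr // 10
    n_segs = -(-n_samples // seg)  # ceiling division
    gate = rng.random(n_segs) < activity
    envelope = np.repeat(gate, seg)[:n_samples].astype(float)
    return envelope * rng.standard_normal(n_samples)

def make_babble(talkers, reverse=False):
    """Sum the talker signals; optionally time-reverse the mixture
    (equivalent to reversing each talker before mixing)."""
    mix = np.sum(talkers, axis=0)
    return mix[::-1] if reverse else mix

def saturation(signal, frame_len=512, hop=256, floor_db=-60.0):
    """Assumed proxy: fraction of time-frequency cells whose power lies
    above a floor set floor_db below the spectrogram maximum."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    power = np.abs(np.fft.rfft(signal[idx] * window, axis=1)) ** 2
    threshold = power.max() * 10 ** (floor_db / 10)
    return float(np.mean(power > threshold))

rng = np.random.default_rng(0)
talkers = [synthetic_talker(rng, 160000) for _ in range(8)]

# Saturation for nested 4-, 6-, and 8-talker babble, natural and reversed.
results = {
    n: (
        saturation(make_babble(talkers[:n])),
        saturation(make_babble(talkers[:n], reverse=True)),
    )
    for n in (4, 6, 8)
}

for n, (natural, reversed_) in results.items():
    print(f"{n} talkers: natural={natural:.3f}, reversed={reversed_:.3f}")
```

Under these assumptions the proxy grows with the number of talkers because each added talker fills gaps left by the others, and time reversal leaves it essentially unchanged because it depends only on spectral magnitudes. This mirrors the qualitative pattern reported above, but it says nothing about the numerical values obtained in the study.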
