Perception of Phonemically Ambiguous Spoken Sequences in French

Anne-Laure Schaegis (al.schaegis@infonie.fr)
Elsa Spinelli (elsa.spinelli@upmf-grenoble.fr)
Université Pierre Mendès France, Laboratoire de Psychologie et NeuroCognition, BP 48, 38040 Grenoble Cedex 9, FRANCE

Pauline Welby (welby@icp.inpg.fr)
Institut de la Communication Parlée, CNRS UMR 5009, Institut National Polytechnique de Grenoble, Université Stendhal, 46, avenue Félix Viallet, 38031 Grenoble Cedex 1, FRANCE

Abstract

Because the speech signal is continuous, listeners must segment the speech stream in order to recognize words. Due to elision, some spoken utterances in French are phonemically ambiguous (e.g., C'est l'affiche 'It's the poster' vs. C'est la fiche 'It's the sheet', both [selafiʃ]), and correct segmentation is necessary for recognition and comprehension. The aim of this study was to assess whether listeners can discriminate and identify such phonemically ambiguous utterances. In Experiments 1 and 2, an ABX paradigm was used in a discrimination task. The observed accuracy shows that listeners succeeded in discriminating between the two ambiguous stimuli, whether identical or different tokens of those stimuli were used. In Experiment 3, a forced-choice task, listeners were able to retrieve the correct segmentation and correctly identify such ambiguous stimuli. Acoustic analyses identified some of the acoustic differences between members of the pairs (l'affiche vs. la fiche). These differences are likely to be used by listeners during word segmentation.

Keywords: segmentation, speech, word recognition.

Introduction

Unlike written language, where words are separated by blank spaces, spoken language has no clear word boundaries. This means that a given stretch of speech can be consistent with multiple lexical hypotheses, and that these hypotheses can begin at different points in the input. In processing the speech stream, the listener is therefore routinely confronted with temporary ambiguities.
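The multiple-hypothesis problem can be made concrete with a toy dictionary-based parser: given an unsegmented phone string and a lexicon, it enumerates every way of carving the string into known words. This is only an illustrative sketch; the lexicon and the rough ASCII phone spellings below are assumptions for the example, not the experimental materials.

```python
def all_parses(phones, lexicon):
    """Enumerate every segmentation of an unsegmented phone string
    into a sequence of lexicon entries."""
    if not phones:
        return [[]]  # one parse of the empty string: no words
    parses = []
    for end in range(1, len(phones) + 1):
        prefix = phones[:end]
        if prefix in lexicon:
            for rest in all_parses(phones[end:], lexicon):
                parses.append([prefix] + rest)
    return parses

# Toy lexicon in a rough ASCII rendering of [selafiS] (hypothetical):
lexicon = {"se", "la", "lafiS", "fiS"}
# "selafiS" is consistent with both parses, C'est la fiche and
# C'est l'affiche: [['se', 'la', 'fiS'], ['se', 'lafiS']]
```

With a real lexicon the number of parses grows quickly, which is exactly why segmentation cues in the signal are useful to the listener.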
Thus, in the French sequence son chat potelé [sɔ̃ʃapɔtle] 'his/her plump cat', the recognition system must select between competing hypotheses like son chat [sɔ̃ʃa]… 'his/her cat' and son chapeau [sɔ̃ʃapo]… 'his/her hat', which, to a first approximation, are equally supported by segmental information. Hence, in order to recognize words in spoken language, listeners must segment the speech stream into discrete word units. How do listeners accomplish this task?

Segmentation could be based on several sources of information contained in the speech signal. Listeners could exploit regularities associated with word beginnings and ends in segmenting the speech stream. One source of information could come from language-specific metrical structure. In English, for example, most content words start with a strong syllable. Cutler and Norris (1988) proposed the Metrical Segmentation Strategy (MSS), according to which listeners exploit such prosodic probabilities to segment speech: lexical access is initiated at each strong syllable the listener encounters. In a word-spotting experiment, Cutler and Norris (1988) showed that CVCC words like mint [mɪnt] are easier to detect in Strong-Weak sequences (e.g., [mɪn.təf], where only the first syllable is stressed) than in Strong-Strong sequences (e.g., [mɪn.tɛf], where both syllables are stressed). In the latter sequences, detection was hypothesized to be slowed by misalignment with the syllabic boundary posited before the second stressed syllable.

Other prosodic cues have been shown to play a role in segmentation. For example, in French, the last syllable of a prosodic phrase is lengthened and has special prominence. Bacri and Banel (1994) showed that listeners could exploit this pattern in word segmentation.
Given ambiguous sequences like [ba.gaʒ] (bagage 'luggage' or bas gage 'low pledge'), listeners were more likely to hear one word (bagage) when the second syllable was lengthened and two words (bas gage) when the first was lengthened. This finding corresponds to the expectation that a phrase-final syllable will be lengthened (and that a phrase boundary will not occur in the middle of a word). More recently, Welby (2003a/b) showed that French listeners could use the presence of an optional rise in fundamental frequency (f0), or even a simple "elbow" in the f0 curve, as a cue to a content word beginning. Listeners interpreted nonsense sequences like [me.la.mɔ̃.din] as a single nonword mélamondine when the f0 rise began at the first syllable ([me]), and as two words, mes lamondines 'my lamondines', when it began at the second syllable ([la]). Another source of information could come from the phonotactic rules of a given language. For example, if a certain phone sequence is not a possible syllable onset cluster in a given language (e.g., [mO] in French or [mr] in
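The durational pattern reported by Bacri and Banel can be stated as a simple decision rule: because a lengthened syllable tends to mark the end of a prosodic phrase, lengthening on the final syllable of a sequence favors a one-word parse, while marked lengthening on an earlier syllable favors a word boundary right after it. A minimal sketch of such a rule follows; the function name, the millisecond values, and the 1.2 lengthening ratio are illustrative assumptions, not values from the study.

```python
def parse_from_duration(syl_durations, ratio=1.2):
    """Toy rule: if a non-final syllable is markedly longer than the
    final one, posit a word boundary after it (two-word parse);
    otherwise treat the whole sequence as a single word."""
    final = syl_durations[-1]
    for i, d in enumerate(syl_durations[:-1]):
        if d > ratio * final:
            return ("two words", i + 1)  # boundary after syllable i + 1
    return ("one word", None)

# For [ba.gaZ]: a lengthened second syllable favors bagage (one word),
# while a lengthened first syllable favors bas gage (two words).
```

For instance, under this toy rule, durations of [120, 200] ms yield a one-word parse and [200, 120] ms yield a boundary after the first syllable, mirroring the bagage / bas gage contrast.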

References

[1] W. Marslen-Wilson et al. Levels of perceptual representation and process in lexical access: words, phonemes, and features. Psychological Review, 1994.

[2] L. Nakatani et al. Locus of segmental cues for word juncture. The Journal of the Acoustical Society of America, 1977.

[3] P. Boersma et al. Praat: doing phonetics by computer. 2003.

[4] H. Quené. Durational cues for word segmentation in Dutch. 1992.

[5] U. H. Frauenfelder et al. À la recherche d'indices de frontière lexicale dans la resyllabation. 2002.

[6] P. Boersma et al. Praat, a system for doing phonetics by computer. 2002.

[7] P. Keating et al. Articulatory strengthening at edges of prosodic domains. The Journal of the Acoustical Society of America, 1997.

[8] D. K. Oller. The effect of position in utterance on speech segment duration in English. The Journal of the Acoustical Society of America, 1973.

[9] B. New et al. Une base de données lexicales du français contemporain sur internet : LEXIQUE. 2001.

[10] T. Cho et al. Articulatory and acoustic studies on domain-initial strengthening in Korean. Journal of Phonetics, 2001.

[11] C. Fougeron et al. Articulatory properties of initial segments in several prosodic constituents in French. Journal of Phonetics, 2001.

[12] P. Welby. French intonational rises and their role in speech segmentation. INTERSPEECH, 2003.

[13] S. Blumstein et al. The effect of subphonetic differences on lexical access. Cognition, 1994.

[14] A. Cutler et al. The role of strong syllables in segmentation for lexical access. 1988.

[15] N. Bacri et al. On metrical patterns and lexical parsing in French. Speech Communication, 1994.

[16] A. Cutler et al. Processing resyllabified words in French. 2003.

[17] J. McQueen. Segmentation of Continuous Speech Using Phonotactics. 1998.

[18] P. C. Gordon et al. Lexical and prelexical influences on word segmentation: evidence from priming. Journal of Experimental Psychology: Human Perception and Performance, 1995.

[19] P. S. Welby. The Slaying of Lady Mondegreen, being a Study of French Tonal Association and Alignment and their Role in Speech Segmentation. 2003.