Paper information - Lexical embedding in spoken Dutch

Lexical embedding in spoken Dutch

A stretch of speech is often consistent with multiple words. For example, the sequence /hæm/ is consistent with ‘ham’ but also with the first syllable of ‘hamster’, which results in temporary ambiguity. But how often does such lexical embedding actually occur? Analyses of two corpora of spoken Dutch showed that 11.9%-19.5% of polysyllabic word tokens contain a word-initial embedded word, while 4.1%-7.5% of monosyllabic word tokens can appear word-initially embedded in another word. These rates are much lower than those suggested by an analysis of a large dictionary of Dutch. Speech processing thus appears to be simpler than dictionary-based statistics would lead one to expect.
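As a rough illustration of the kind of count the abstract reports, the following minimal Python sketch computes the share of polysyllabic tokens that begin with another complete word. It is not the authors' procedure: the toy lexicon, its transcriptions, the function names, and the assumption that embeddings must align with syllable boundaries are all hypothetical choices made for this example.

```python
# Hypothetical toy lexicon: word -> syllabified phonemic transcription.
# Entries are illustrative only and are not taken from the corpora in the paper.
LEXICON = {
    "ham": ("hɑm",),
    "hamster": ("hɑm", "stər"),
    "kat": ("kɑt",),
    "kater": ("ka", "tər"),
    "water": ("wa", "tər"),
}


def initial_embeddings(word, lexicon):
    """Return the words whose full transcription equals an initial
    syllable sequence of `word` (assumption: syllable-aligned matching)."""
    syllables = lexicon[word]
    found = []
    # Only proper prefixes: 1 .. len-1 syllables, so a word never embeds itself.
    for n in range(1, len(syllables)):
        prefix = syllables[:n]
        found.extend(w for w, s in lexicon.items() if s == prefix and w != word)
    return found


def proportion_with_initial_embedding(tokens, lexicon):
    """Fraction of polysyllabic tokens that carry a word-initial embedding."""
    polysyllabic = [t for t in tokens if len(lexicon[t]) > 1]
    if not polysyllabic:
        return 0.0
    hits = sum(1 for t in polysyllabic if initial_embeddings(t, lexicon))
    return hits / len(polysyllabic)


# Hypothetical token sequence standing in for a transcribed corpus.
tokens = ["hamster", "water", "kat", "kater", "ham", "water"]
print(initial_embeddings("hamster", LEXICON))
print(proportion_with_initial_embedding(tokens, LEXICON))
```

Counting over tokens rather than over dictionary entries is what distinguishes a corpus-based estimate from a dictionary-based one: frequent words without embeddings pull the proportion down even if many rare dictionary entries contain embedded words.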

Odette Scharenborg | Stefanie Okolowski
