BALDEY: A Database of Auditory Lexical Decisions

The Quarterly Journal of Experimental Psychology

In an auditory lexical decision experiment, 5541 spoken content words and pseudowords were presented to 20 native speakers of Dutch. The words vary in phonological make-up, number of syllables, and stress pattern. They are also representative of the native Dutch vocabulary in that most are morphologically complex, comprising two stems or one stem plus derivational and inflectional suffixes, with inflections representing both regular and irregular paradigms; the pseudowords were matched to the real words in these respects. The BALDEY (“biggest auditory lexical decision experiment yet”) data file includes response times and accuracy rates, together with morphological information for each item and phonological and acoustic information derived from automatic phonemic segmentation of the stimuli. Two initial analyses illustrate how this data set can be used. First, we discuss several measures of the point at which a word has no further neighbours and compare how well each measure predicts our lexical decision response outcomes. Second, we investigate how well four different measures of frequency of occurrence (from written corpora, spoken corpora, subtitles, and frequency ratings by 75 participants) predict the same outcomes. These analyses motivate general conclusions about the auditory lexical decision task. The publicly available BALDEY database lends itself to many further analyses.
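To make the notion of "the point at which a word has no further neighbours" concrete, the following is a minimal sketch of one simple uniqueness-point measure. It is not one of the measures compared in the paper. The function name `uniqueness_point`, the toy lexicon, and the use of orthographic characters in place of phonemic segments are all assumptions made for illustration.

```python
def uniqueness_point(word, lexicon):
    """Return the 1-based position of the first segment at which `word`
    shares its onset with no other entry in `lexicon`.

    Returns None when the word never becomes unique before its offset,
    for example when it is fully embedded in the onset of a longer word.
    Assumption: each character stands in for one phonemic segment.
    """
    others = [w for w in lexicon if w != word]
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        # The word is unique at position i if no competitor starts with prefix.
        if not any(w.startswith(prefix) for w in others):
            return i
    return None


# Hypothetical toy lexicon; not taken from the BALDEY item set.
toy_lexicon = ["kat", "kater", "kap", "boom"]

for w in toy_lexicon:
    print(w, uniqueness_point(w, toy_lexicon))
```

In this toy lexicon, "kat" never becomes unique because "kater" continues it, which is the kind of case that motivates considering more than one measure.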
