Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English

It has been well established that word frequency is a very important variable in cognitive processing. High-frequency words are perceived and produced more quickly and more efficiently than low-frequency words (e.g., Balota & Chumbley, 1984; Jescheniak & Levelt, 1994; Monsell, Doyle, & Haggard, 1989; Rayner & Duffy, 1986). At the same time, high-frequency words are easier to recall but more difficult to recognize in episodic memory tasks (e.g., Glanzer & Bowles, 1976; Yonelinas, 2002). To investigate the word frequency effect, psychologists need estimates of how often words occur in a language. Howes and Solomon (1951), for instance, made use of Thorndike and Lorge’s (1944; hereafter, TL) list of words as counted in books. Subsequently, Kučera and Francis’s (1967; hereafter, KF) frequency norms became the measure of preference and formed the basis of over 40 years of psycholinguistic and memory research in the U.S. This may be surprising, because the KF list was based on a corpus of only 1.014 million words, whereas TL was based on a corpus of 18 million words. KF may have become more popular because its texts were more recent (from 1961 vs. the 1920s and 1930s) and consisted entirely of adult reading material, whereas TL also contained children’s books. Differences in availability may have played a role as well, in addition to a snowball effect (once KF was used in a number of key articles, it became the measure of choice for the group of researchers working on that topic).

[1]  W. Nelson Francis,et al.  FREQUENCY ANALYSIS OF ENGLISH USAGE: LEXICON AND GRAMMAR , 1983 .

[2]  Françoise Vitu,et al.  Word skipping: Implications for theories of eye movement control in reading , 1998 .

[3]  R. Johnston,et al.  Age of acquisition and lexical processing , 2006 .

[4]  R. Holloway The broth in my brother’s brothel: Morpho-orthographic segmentation in visual word recognition , 2005 .

[5]  M. Brysbaert,et al.  The use of film subtitles to estimate word frequencies , 2007, Applied Psycholinguistics.

[6]  W. Hockley The effects of environmental context on recognition memory and claims of remembering. , 2008, Journal of experimental psychology. Learning, memory, and cognition.

[7]  Mark S. Seidenberg,et al.  Age of Acquisition Effects in Word Reading and Other Tasks , 2002 .

[8]  B. Underwood Ten years of massed practice on distributed practice. , 1961 .

[9]  M Glanzer,et al.  The mirror effect in recognition memory , 1984, Memory & cognition.

[10]  Michael J Cortese,et al.  Visual word recognition of single-syllable words. , 2004, Journal of experimental psychology. General.

[11]  C. Peirce,et al.  The Fixation of Belief , 2011, Philosophy after Darwin.

[12]  W. Levelt,et al.  Word frequency effects in speech production: Retrieval of syntactic information and of phonological form , 1994 .

[13]  Michael J Cortese,et al.  Age of acquisition predicts naming and lexical-decision performance above and beyond 22 other predictor variables: An analysis of 2,342 words , 2007, Quarterly journal of experimental psychology.

[14]  C. Davis,et al.  Semantic involvement in reading aloud: evidence from a nonword training study. , 2008, Journal of experimental psychology. Learning, memory, and cognition.

[15]  Marc Brysbaert,et al.  Lexique 2: A new French lexical database , 2004, Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc.

[16]  G. Leech,et al.  Word Frequencies in Written and Spoken English: based on the British National Corpus , 2001 .

[17]  K. Rastle,et al.  The processing of singular and plural nouns in French and English , 2004 .

[18]  Matthew J Pastizzo,et al.  Spoken word frequency counts based on 1.6 million words in American English , 2007, Behavior research methods.

[19]  H. Clahsen,et al.  Lexical entries and rules of language: A multidisciplinary study of German inflection , 1999, Behavioral and Brain Sciences.

[20]  K. Szpunar,et al.  Testing during study insulates against the buildup of proactive interference. , 2008, Journal of experimental psychology. Learning, memory, and cognition.

[21]  Curt Burgess,et al.  Producing high-dimensional semantic spaces from lexical co-occurrence , 1996 .

[22]  G. Waters,et al.  Reading words aloud: A mega study , 1989 .

[23]  R. Baayen,et al.  Morphological influences on the recognition of monosyllabic monomorphemic words , 2006 .

[24]  Lee H. Wurm,et al.  Lexical dynamics for low-frequency complex words: A regression study across tasks and modalities , 2007 .

[25]  K. Rayner,et al.  The word grouping hypothesis and eye movements during reading. , 2008, Journal of experimental psychology. Learning, memory, and cognition.

[26]  M. Brysbaert,et al.  Reexamining the word length effect in visual word recognition: New evidence from the English Lexicon Project , 2006, Psychonomic bulletin & review.

[27]  R. Baayen,et al.  Singulars and plurals in Dutch: Evidence for a parallel dual-route model , 1997 .

[28]  D. Besner,et al.  Reading aloud: qualitative differences in the relation between stimulus quality and word frequency as a function of context. , 2008, Journal of experimental psychology. Learning, memory, and cognition.

[29]  R. H. Baayen,et al.  The CELEX Lexical Database (CD-ROM) , 1996 .

[30]  Ian M. McDonough,et al.  Autobiographical elaboration reduces memory distortion: cognitive operations and the distinctiveness heuristic. , 2008, Journal of experimental psychology. Learning, memory, and cognition.

[31]  Michael B. Lewis,et al.  Age of acquisition and the cumulative-frequency hypothesis: a review of the literature and a new multi-task investigation. , 2004, Acta psychologica.

[32]  M. Glanzer,et al.  Analysis of the word-frequency effect in recognition memory , 1976 .

[33]  D. Balota,et al.  Are lexical decisions a good measure of lexical access? The role of word frequency in the neglected decision stage. , 1984, Journal of experimental psychology. Human perception and performance.

[34]  F. Pulvermüller,et al.  Effects of word length and frequency on the human event-related potential , 2004, Clinical Neurophysiology.

[35]  Jeffrey M. Zacks,et al.  Pictures of a thousand words: Investigating the neural mechanisms of reading with extremely rapid event-related fMRI , 2008, NeuroImage.

[36]  K. Rayner Eye movements in reading and information processing: 20 years of research. , 1998, Psychological bulletin.

[37]  Marc Brysbaert,et al.  WordGen: A tool for word selection and nonword generation in Dutch, English, German, and French , 2004, Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc.

[38]  R. Solomon,et al.  Visual duration threshold as a function of word-probability. , 1951, Journal of experimental psychology.

[39]  Rebecca Treiman,et al.  The English Lexicon Project , 2007, Behavior research methods.

[40]  P. Witty The teacher's word book of 30,000 words. , 1945 .

[41]  K. Rayner,et al.  Lexical complexity and fixation times in reading: Effects of word frequency, verb complexity, and lexical ambiguity , 1986, Memory & cognition.

[42]  A. D. Groot,et al.  Disentangling Context Availability and Concreteness in Lexical Decision and Word Translation , 1998 .

[43]  Michael C. Doyle,et al.  Effects of frequency on visual word recognition tasks: where are they? , 1989, Journal of experimental psychology. General.

[44]  M. Taft Morphological Decomposition and the Reverse Base Frequency Effect , 2004, The Quarterly journal of experimental psychology. A, Human experimental psychology.

[45]  A. Caramazza,et al.  Lexical access and inflectional morphology , 1988, Cognition.

[46]  Gordon D. A. Brown,et al.  Contextual Diversity, Not Word Frequency, Determines Word-Naming and Lexical Decision Times , 2006, Psychological science.

[47]  M. Gaskell,et al.  Lexical competition and the acquisition of novel words , 2003, Cognition.

[48]  D. Titone,et al.  Making sense of word senses: the comprehension of polysemy depends on sense overlap. , 2008, Journal of experimental psychology. Learning, memory, and cognition.

[49]  Barbara J. Juhasz,et al.  Age-of-acquisition effects in word and picture identification. , 2005, Psychological bulletin.

[50]  Curt Burgess,et al.  The effect of corpus size in predicting reaction time in a basic word recognition task: Moving on from Kučera and Francis , 1998 .

[51]  A. Yonelinas The Nature of Recollection and Familiarity: A Review of 30 Years of Research , 2002 .

[52]  T. Curran,et al.  Effects of repetition priming on recognition memory: testing a perceptual fluency-disfluency model. , 2008, Journal of experimental psychology. Learning, memory, and cognition.

[53]  Irene V Blair,et al.  Using Internet search engines to estimate word frequency , 2002, Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc.