Abstract. The quantitative approach to morphological productivity first proposed by Baayen crucially relies on the relation between the number of hapax legomena formed with a given affix in a sufficiently large corpus and the total number of tokens of that affix sampled in the corpus. Most criticism of this measure focuses on its neglect of the role played by frequency in the evaluation of productivity. As an improvement on Baayen's procedure, a variable-corpus approach is proposed: productivity values should be calculated at equal token numbers for different affixes, rather than at the different token numbers that result from sampling the whole corpus for all affixes, as in Baayen's works. This implies that variably sized subcorpora must be sampled to compare affixes with different frequencies. On the basis of a 75-million-token newspaper corpus, productivity values are calculated for several Italian affixes in the deverbal and deadjectival domains. The resulting ranking proves linguistically plausible and avoids the overestimation of productivity for low-frequency affixes that typically occurs in fixed-corpus calculations. As a further advantage, the proposed procedure makes it possible to deal satisfactorily with two problematic aspects usually neglected in previous investigations, namely the quantitative impact on productivity measures of (i) allomorphies and lexicalizations and (ii) inner-cycle derivations.
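To make the contrast between fixed-corpus and variable-corpus calculations concrete, the following is a minimal sketch, not the authors' implementation. It assumes the hapax-based measure takes the commonly cited form P = n1 / N (hapax legomena over affix tokens), and that a variable-sized subcorpus can be approximated by truncating each affix's tokens, in corpus order, at a common token count. The function names and the toy token lists are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def productivity(affix_tokens: Iterable[str]) -> float:
    """Hapax-to-token ratio n1 / N over the tokens of one affix."""
    counts = Counter(affix_tokens)
    n_tokens = sum(counts.values())
    if n_tokens == 0:
        raise ValueError("no tokens for this affix")
    # Hapax legomena: word types that occur exactly once in the sample.
    n_hapaxes = sum(1 for c in counts.values() if c == 1)
    return n_hapaxes / n_tokens


def productivity_at(affix_tokens: List[str], n_tokens: int) -> float:
    """Same ratio, computed on the first n_tokens tokens only.

    Assumption: affix_tokens is in corpus order, so the prefix stands in
    for the subcorpus in which the affix reaches n_tokens occurrences.
    """
    if n_tokens > len(affix_tokens):
        raise ValueError("affix has fewer tokens than requested")
    return productivity(affix_tokens[:n_tokens])


# Toy data (invented): each letter is a derived word type with the affix.
frequent_affix = ["a", "a", "b", "a", "c", "a", "b", "a"]  # 8 tokens
rare_affix = ["x", "y", "x"]                               # 3 tokens

# Fixed corpus: each affix is measured at its own token count.
fixed_frequent = productivity(frequent_affix)
fixed_rare = productivity(rare_affix)

# Variable corpus: both affixes are measured at the same token count.
common_n = len(rare_affix)
variable_frequent = productivity_at(frequent_affix, common_n)
variable_rare = productivity_at(rare_affix, common_n)
```

In this toy case the fixed-corpus values differ widely (0.125 for the frequent affix against one third for the rare one), whereas at the common token count both affixes receive the same value. This illustrates how comparing at unequal token numbers can inflate the score of a low-frequency affix.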
[1] R. Harald Baayen, et al. Productivity in language production. 1994.
[2] M. Aronoff, et al. Producing morphologically complex words. 1988.
[3] Harald Baayen, et al. Productivity and English derivation: a corpus-based study. 1991.
[4] R. Baayen, et al. Chronicling the Times: Productive Lexical Innovations in an English Newspaper. 1996.
[5] R. Harald Baayen, et al. Morphological productivity across speech and writing. English Language and Linguistics, 1999.
[6] Livio Gaeta, et al. Italian prefixes and productivity: a quantitative approach. 2003.