Syllabification of Middle Dutch

Spelling variation offers a window onto the phonological systems of the Middle Dutch dialects and the extent to which they differed. Syllabic information is of great help in studying this variation, but manually annotating large corpora is labor-intensive. We present a method for the automatic syllabification of words in Middle Dutch texts. We adapt an existing method for hyphenating (Modern) Dutch words by modifying the definitions of nucleus and onset and by adding rules that deal with spelling variation. The method combines a rule-based finite-state component with data-driven error-correction rules. The system achieves a hyphenation accuracy of 98.4% and a word accuracy of 97.4%. We apply the method to a Middle Dutch corpus and show that the resulting annotation makes it possible to study temporal and regional variation in phonology as reflected in spelling.
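
The two reported figures measure different things, and a small sketch may help to keep them apart. The definitions below are assumptions, since the abstract does not spell them out: hyphenation accuracy is taken to be the share of positions between adjacent letters that are correctly labelled as break or no break, and word accuracy the share of words whose syllabification is correct in its entirety. The function names and the three example words are hypothetical illustrations, not data or code from the system described here.

```python
def boundary_positions(syllabified, sep="-"):
    """Return the letter offsets at which a syllable break occurs."""
    positions = set()
    offset = 0
    for ch in syllabified:
        if ch == sep:
            positions.add(offset)   # break falls before letter number `offset`
        else:
            offset += 1
    return positions


def evaluate(gold, predicted, sep="-"):
    """Return (hyphenation accuracy, word accuracy) under the assumed definitions."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    total_slots = correct_slots = correct_words = 0
    for g, p in zip(gold, predicted):
        word = g.replace(sep, "")
        if p.replace(sep, "") != word:
            raise ValueError("gold and predicted differ in letters: %r vs %r" % (g, p))
        g_pos = boundary_positions(g, sep)
        p_pos = boundary_positions(p, sep)
        slots = max(len(word) - 1, 0)      # positions between adjacent letters
        wrong = len(g_pos ^ p_pos)         # missed breaks plus spurious breaks
        total_slots += slots
        correct_slots += slots - wrong
        correct_words += int(g_pos == p_pos)
    hyphenation_acc = correct_slots / total_slots if total_slots else 1.0
    word_acc = correct_words / len(gold) if gold else 1.0
    return hyphenation_acc, word_acc


# Illustrative input only (hypothetical gold standard and system output).
gold = ["co-ninc", "va-der", "moe-der"]
predicted = ["con-inc", "va-der", "moe-der"]
hyphenation_acc, word_acc = evaluate(gold, predicted)
```

In this toy input a single misplaced break counts as two position errors (one missed, one spurious) but as only one wrong word, so under these assumed definitions the two measures can diverge in either direction on small samples.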