TreeTalk-D: A Machine Learning Approach to Dutch Word Pronunciation

We present experimental results concerning the application of the IGTree decision-tree learning algorithm to Dutch word pronunciation. We evaluate four different Dutch word pronunciation systems configured to test the utility of modularization of grapheme-to-phoneme transcription (G) and stress prediction (S). Both training and testing data are extracted from the CELEX II lexical database. Experiments yield full word transcription accuracies (stressed and syllabified phonetic transcription) of roughly 75%, and 97% accuracy on G at the letter level. The best system performs G and S in sequence, using a context of four letters left and right per grapheme-phoneme mapping.
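To make the notion of a context of four letters left and right concrete, the following is a minimal sketch of how a word might be cut into fixed-width letter windows, one per grapheme-phoneme mapping. The function name `letter_windows`, the padding symbol `_`, and the example word are illustrative assumptions, not details taken from the TreeTalk-D systems themselves.

```python
def letter_windows(word, context=4, pad="_"):
    """Return one fixed-width window per letter of `word`.

    Each window holds the focus letter in the middle, with `context`
    letters to its left and right. Positions beyond the word boundary
    are filled with `pad` (an assumed padding symbol).
    """
    padded = pad * context + word + pad * context
    width = 2 * context + 1
    return [padded[i:i + width] for i in range(len(word))]


# Hypothetical example: windows for the Dutch word "boek".
for window in letter_windows("boek"):
    focus = window[len(window) // 2]
    print(window, "focus letter:", focus)
```

Each window would then serve as one instance for the learner, paired with the phoneme (or stress marker) of its focus letter as the class to predict.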