Pitch modelling for the Nguni languages

Although the complexity of prosody is widely recognised, the lack of generally accepted descriptive standards for prosodic phenomena has meant that the prosodic systems of most of the world's languages have, at best, been described in impressionistic, rule-based terms. For the languages of Southern Africa, the deficiencies in our modelling capabilities are acute. Little quantitative work has been published for the languages of the Nguni family (such as isiZulu and isiXhosa), and the literature on this topic contains significant contradictions and imprecisions, which stem in part from the lack of quantitative, measurement-driven analysis. This paper therefore embarks on a programme aimed at understanding the relationship between linguistic and physical variables of a prosodic nature in this family of languages. First, we undertake a set of experiments to select an appropriate pitch tracking algorithm for the Nguni family of languages. We then use this algorithm to extract relevant data from speech recordings and build intonation corpora for isiZulu and isiXhosa. Using the data in these corpora, we show that it is possible to develop fairly accurate intonation models for isiZulu and isiXhosa using a neural network classifier.