Towards a free/open-source universal-dependency treebank for Kazakh

This article describes the first steps towards a free/open-source dependency treebank for Kazakh based on the Universal Dependencies (UD) annotation standards. The treebank contains 402 sentences drawn from a range of open-source and public-domain texts, which ensures that it is freely available and can be extended. Texts in the treebank are first morphologically analysed and disambiguated, and then manually annotated for dependency structure. We present some issues in the dependency syntax of Kazakh and show how they are analysed in the UD framework. We also report preliminary results for statistical dependency parsing of Kazakh and outline some directions for future research.