Endangered Uralic Languages and Language Technologies

Language tools and resources for the analysis of less-elaborated languages are the focus of our workshop. Some research tracks still do not exploit language technology solutions sufficiently or effectively, and for many languages the tools and resources that would serve as a basis for further applications have yet to be developed. The presentation introduces a set of morphological tools for small and endangered Uralic languages. Several Hungarian research groups specializing in Finno-Ugric linguistics and a Hungarian language technology company (MorphoLogic) have initiated a project with the goal of producing annotated electronic corpora and computational morphological tools for small Uralic languages such as Mordvin, Udmurt (Votyak), Komi (Zyryan), Mansi (Vogul), Khanty (Ostyak), Nenets (Yurak) and Nganasan (Tavgi). Altogether, the communities of around a dozen Uralic languages, totaling some 3.3 million people, live as scattered minorities in Russia, as shown by the map below: