Czech Morphological Tagset Revisited
Much of natural language processing is built on top of solid
morphological annotation. In this paper we present an update of
the Czech morphological tagset as given by the analyzer Ajka,
which has been used for academic as well as commercial purposes
for more than a dozen years. The revision responds to practical
issues that we faced during the development of subsequent NLP
tools, parsers in the first place. We describe the reasoning
behind each of the changes and include the full updated tagset
reference manual. Finally, we provide a comparison with, and a
mapping to, the Universal tagset produced by Google.
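
To make the idea of a tagset mapping concrete, here is a minimal sketch in Python. It is not the mapping defined in the paper. It assumes the commonly documented attribute-value notation of Ajka tags (for example `k1gMnSc1`, where `k` is the part of speech, `g` gender, `n` number, `c` case) and the twelve coarse categories of the Universal tagset. The names `parse_tag`, `to_universal`, and `POS_TO_UNIVERSAL` are hypothetical, and the revised tagset may differ in its details.

```python
import re

# Assumption: a tag is a sequence of (lowercase attribute, one-character value)
# pairs, e.g. "k1gMnSc1" -> k=1, g=M, n=S, c=1.
_PAIR = re.compile(r"([a-z])([A-Z0-9])")

# Assumption: illustrative mapping from the part-of-speech attribute "k" to
# Universal tagset categories; the paper's actual mapping may be finer-grained.
POS_TO_UNIVERSAL = {
    "1": "NOUN",  # noun
    "2": "ADJ",   # adjective
    "3": "PRON",  # pronoun
    "4": "NUM",   # numeral
    "5": "VERB",  # verb
    "6": "ADV",   # adverb
    "7": "ADP",   # preposition
    "8": "CONJ",  # conjunction
    "9": "PRT",   # particle
    "0": "X",     # interjection (no dedicated Universal category)
}


def parse_tag(tag: str) -> dict:
    """Split an attribute-value tag into a dictionary of attributes."""
    return dict(_PAIR.findall(tag))


def to_universal(tag: str) -> str:
    """Map a tag to a coarse Universal category, falling back to "X"."""
    return POS_TO_UNIVERSAL.get(parse_tag(tag).get("k"), "X")
```

Under these assumptions, `to_universal("k1gMnSc1")` looks only at the `k` attribute and ignores the remaining attributes. A real mapping would likely need some of them, for instance to separate determiner-like pronouns from other pronouns.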
[1] Pavel Smerk. Fast Morphological Analysis of Czech, 2009, RASLAN.
[2] Karel Pala, et al. DESAM - Annotated Corpus for Czech, 1997, SOFSEM.
[3] Slav Petrov, et al. A Universal Part-of-Speech Tagset, 2011, LREC.