Modeling Derivational Morphology in Ukrainian

We report on a study applying compositional distributional semantic models (CDSMs) to a set of Ukrainian derivational patterns. Ukrainian is an interesting test case because it is both morphologically rich and low-resource. Our study aims to resolve inconsistent results from previous studies that employed CDSMs for derivation: we provide evidence for a cross-lingual advantage of CBOW over NMF representations, and of a simple additive model over a lexical function model. In addition, we present two case studies in which we test how well CDSMs handle pattern-level ambiguity and apply the same CDSMs to inflectional patterns.
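To make the contrast between the two composition models concrete, the following is a minimal sketch, not the setup used in the study. It assumes that the additive model predicts a derived word's vector by adding a pattern-specific shift vector (the mean difference between derived and base vectors in the training pairs), and that the lexical function model predicts it by applying a pattern-specific linear map fitted with ridge regression. All function names, the `ridge` parameter, and the synthetic vectors are hypothetical and stand in for real CBOW or NMF representations.

```python
import numpy as np

def fit_additive(bases, deriveds):
    """Additive model: one shift vector per derivational pattern.

    The shift is the mean difference between derived and base vectors.
    """
    return (deriveds - bases).mean(axis=0)

def fit_lexical_function(bases, deriveds, ridge=1.0):
    """Lexical function model: one matrix per derivational pattern.

    Solves the ridge-regularised least-squares problem
    (B^T B + ridge * I) W = B^T D, so that base @ W approximates derived.
    The ridge penalty is an assumption made for this sketch.
    """
    dim = bases.shape[1]
    gram = bases.T @ bases + ridge * np.eye(dim)
    return np.linalg.solve(gram, bases.T @ deriveds)

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Synthetic stand-in for (base, derived) training pairs of one pattern.
rng = np.random.default_rng(0)
bases = rng.normal(size=(50, 8))
true_shift = rng.normal(size=8)
deriveds = bases + true_shift + 0.01 * rng.normal(size=(50, 8))

shift = fit_additive(bases, deriveds)
W = fit_lexical_function(bases, deriveds)

# Predict the derived vector for an unseen base word with both models.
new_base = rng.normal(size=8)
pred_additive = new_base + shift
pred_lexfun = new_base @ W

print("additive vs. lexical function prediction, cosine:",
      cosine(pred_additive, pred_lexfun))
```

The additive model learns only as many parameters as the vectors have dimensions, while the lexical function model learns a full matrix per pattern, which is one plausible reason the simpler model can be more robust when training pairs are scarce.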
