This paper describes the automatic treatment of multi-word expressions in Estonian, a morphologically complex inflectional language. It focuses on one special type of multi-word expression: verbal multi-word expressions that can function as predicates. The authors describe two language resources: a database of verbal multi-word expressions and a corpus in which these items have been annotated manually. Analysis of the annotated corpus shows that Estonian verbal multi-word expressions alternate in several grammatical categories. The different types of verbal multi-word expressions (opaque and transparent idioms, support verb constructions, and collocations) differ in how freely they alternate in the corpus. The paper describes the main types of these alternations and methods for handling them automatically.