How to Tell a Schneemann from a Milchmann: An Annotation Scheme for Compound-Internal Relations

This paper presents a language-independent annotation scheme for the semantic relations that link the constituents of noun-noun compounds, such as Schneemann ‘snow man’ or Milchmann ‘milk man’. The annotation scheme is hybrid in the sense that it assigns each compound a two-place label consisting of a semantic property and a prepositional paraphrase. The resulting inventory combines the insights of previous annotation schemes that rely exclusively on either semantic properties or prepositions, thus avoiding the known weaknesses that result from using only one of the two label types. The proposed annotation scheme has been used to annotate a set of 5112 German noun-noun compounds. A release of the dataset is currently being prepared and will be made available via the CLARIN Center Tubingen. In addition to the presentation of the hybrid annotation scheme, the paper also reports on an inter-annotator agreement study that has resulted in a substantial agreement among annotators.