TheSoz: A SKOS representation of the thesaurus for the social sciences

The Thesaurus for the Social Sciences (TheSoz) is a Linked Dataset in SKOS format that serves as a crucial instrument for information retrieval, for example in document indexing or search term recommendation. Thesauri and similar controlled vocabularies act as a bridge between datasets in the Linked Open Data cloud. This article describes the conversion of TheSoz to SKOS, including the analysis of the original dataset and its structure, the mapping to appropriate SKOS classes and properties, and the technical conversion. To create a semantically complete representation of TheSoz in SKOS, extensions based on SKOS-XL had to be defined. These allow the modeling of special relations such as compound equivalences and terms with ambiguities. Additionally, mappings to other datasets and applications of TheSoz are presented. Finally, limitations and modeling issues encountered during the creation process are discussed.