Countering language attrition with PanLex and the Web of Data

At present, there are approximately 7,000 living languages in the world. However, some experts claim that globalization may eventually erode this linguistic diversity. The vision of the PanLex project is to help save these languages, especially low-density ones, by making them intertranslatable and thus part of the Information Age. Semantic Web technologies can support this goal because they can flexibly represent, interlink, and reason over data, in our case linguistic resources and annotations in particular. Conversely, an RDF version of PanLex significantly improves the coverage of the Linguistic Web of Data: to the best of our knowledge, there exists no large-scale Linked Data dataset for panlingual translation of non-mainstream languages. In this dataset description paper, we detail how we transformed the data of the PanLex project to RDF, established conformance with the lemon and GOLD data models, interlinked it with Lexvo and DBpedia, and published it as Linked Data and via a SPARQL endpoint.
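
To make the idea of a lemon-style RDF representation concrete, the following minimal sketch shows how two expressions that share a meaning might be serialized as N-Triples. It is an illustration under stated assumptions, not the mapping used in the PanLex conversion: the base URI `http://example.org/panlex/`, the helper names `entry_triples` and `to_ntriples`, the meaning identifier `42`, the use of `dcterms:language` for the Lexvo link, and the choice of the `http://lemon-model.net/lemon#` namespace are all hypothetical choices made for this example.

```python
from urllib.parse import quote

LEMON = "http://lemon-model.net/lemon#"    # assumed lemon namespace
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
LEXVO = "http://lexvo.org/id/iso639-3/"    # Lexvo language identifiers
EX = "http://example.org/panlex/"          # hypothetical base URI


def iri(value):
    """Wrap a URI string in N-Triples angle brackets."""
    return f"<{value}>"


def entry_triples(text, lang, meaning_id):
    """Describe one expression as a lemon lexical entry tied to a meaning."""
    entry = f"{EX}entry/{lang}/{quote(text, safe='')}"
    form, sense = entry + "/form", entry + "/sense"
    # Escape the characters that would break an N-Triples string literal.
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))
    return [
        (iri(entry), iri(RDF_TYPE), iri(LEMON + "LexicalEntry")),
        (iri(entry), iri(DCT + "language"), iri(LEXVO + lang)),
        (iri(entry), iri(LEMON + "canonicalForm"), iri(form)),
        (iri(form), iri(LEMON + "writtenRep"), f'"{escaped}"'),
        (iri(entry), iri(LEMON + "sense"), iri(sense)),
        (iri(sense), iri(LEMON + "reference"), iri(f"{EX}meaning/{meaning_id}")),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Two expressions in different languages that point at the same meaning.
triples = entry_triples("dog", "eng", 42) + entry_triples("Hund", "deu", 42)
print(to_ntriples(triples))
```

In this sketch, two entries count as translations of each other when their senses reference the same meaning resource, so a translation lookup reduces to a join on the `lemon:reference` value.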