An Evolutionary Analysis of DBpedia Datasets

Linked Data, a method for publishing interrelated data on the Semantic Web, has developed rapidly in recent years thanks to new techniques that enhance the availability of knowledge. As one of the most important central hubs of Linked Data, DBpedia is a large crowd-sourced encyclopedia that contains diverse, multilingual knowledge from various domains, represented in RDF. Existing research has mostly focused on the basic characteristics of a single version of the DBpedia datasets. To our knowledge, no evolutionary analysis has yet comprehensively examined how DBpedia changes across versions. In this paper, we first present an overall evolutionary analysis from a graph perspective, characterizing the evolution of DBpedia by comparing six versions of the datasets. We then select two specific domains as subgraphs and calculate a series of metrics to illustrate the changes. Additionally, we carry out an evolutionary analysis of the interlinks between DBpedia and other Linked Data resources. Our analysis shows that although the knowledge in DBpedia has grown overall during the past five years, there are quite a few counter-intuitive results.