On Interlinking Linked Data Sources by Using Ontology Matching Techniques and the Map-Reduce Framework

Interlinking different data sources has become a crucial task due to the explosion of diverse, heterogeneous information repositories in the so-called Web of Data. This paper presents an approach for extracting relationships between entities in very large Linked Data sources. The approach builds on the Map-Reduce processing framework and context-based ontology matching techniques to discover as many relationships as possible between entities in different data sources in a computationally efficient fashion. To this end, the processing flow is composed of three Map-Reduce jobs in charge of 1) collecting linksets between datasets; 2) generating contexts; and 3) constructing entity pairs and computing their similarity. To assess the performance of the proposed scheme, an illustrative prototype is implemented between the DBpedia and LinkedMDB datasets. The obtained results are promising and pave the way towards benchmarking the proposed interlinking procedure against other ontology matching systems.
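
To make the three-job flow concrete, the following is a minimal in-memory sketch, not the implementation evaluated in the paper. Everything specific in it is an assumption made for illustration: the `map_reduce` helper stands in for a real Map-Reduce runtime, the toy triples and prefixes are invented, an entity's context is taken to be the set of tokens in its literal values, and similarity is plain Jaccard overlap with no blocking step.

```python
from collections import defaultdict
from itertools import product


def map_reduce(records, mapper, reducer):
    """In-memory stand-in for one Map-Reduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical toy data: (dataset, subject, predicate, object).
TRIPLES = [
    ("dbpedia", "dbp:Alien", "label", "Alien film 1979"),
    ("dbpedia", "dbp:Alien", "director", "Ridley Scott"),
    ("dbpedia", "dbp:Heat", "label", "Heat film 1995"),
    ("linkedmdb", "lmdb:film1", "label", "Alien 1979"),
    ("linkedmdb", "lmdb:film1", "director", "Ridley Scott"),
    ("linkedmdb", "lmdb:film2", "label", "Heat 1995"),
    ("dbpedia", "dbp:Alien", "sameAs", "lmdb:film1"),
]
# Assumed mapping from URI prefix to dataset name.
PREFIX_TO_DATASET = {"dbp": "dbpedia", "lmdb": "linkedmdb"}


# Job 1 (assumed form): collect existing links, grouped by dataset pair.
def linkset_mapper(triple):
    dataset, subj, pred, obj = triple
    if pred == "sameAs":
        target = PREFIX_TO_DATASET[obj.split(":")[0]]
        yield (dataset, target), (subj, obj)


linksets = map_reduce(TRIPLES, linkset_mapper, lambda _k, links: sorted(links))


# Job 2 (assumed form): build a token-set context for every entity.
def context_mapper(triple):
    dataset, subj, pred, obj = triple
    if pred != "sameAs":
        yield (dataset, subj), set(obj.lower().split())


contexts = map_reduce(TRIPLES, context_mapper, lambda _k, sets: set().union(*sets))


# Job 3 (assumed form): pair entities across datasets and score each pair.
def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0


def pair_mapper(item):
    (dataset, entity), tokens = item
    # Single key: every entity meets every other one (no blocking here).
    yield "all", (dataset, entity, tokens)


def pair_reducer(_key, entries):
    left = [(e, t) for d, e, t in entries if d == "dbpedia"]
    right = [(e, t) for d, e, t in entries if d == "linkedmdb"]
    return {(a, b): jaccard(ta, tb) for (a, ta), (b, tb) in product(left, right)}


sims = map_reduce(contexts.items(), pair_mapper, pair_reducer)["all"]

for pair, score in sorted(sims.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

In this sketch the links collected by the first job could serve as known matches against which the scores from the third job are checked. Sending every entity to a single reducer key is only workable for toy inputs; at scale a blocking key would be needed to limit the number of candidate pairs.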