Most datasets in open government data portals are published in tabular spreadsheet formats such as CSV and XLS. To increase the value and reusability of these datasets, they should also be made available in RDF, which better supports data querying and data integration. Our previous work proposed a semi-automatic framework for generating RDF datasets from existing tabular datasets. In this paper, we extend that framework to support automatic linking of the RDF datasets. An important step is mapping literal values that appear in a dataset to standard URIs. Several previous studies use a semantic search API such as DBpedia or Sindice for URI mapping. However, this approach is not appropriate for the datasets of Thailand's open data portal (Data.go.th) because there is insufficient data for Thai named entities. In addition, a name may match more than one URI, i.e. word ambiguity. For example, the name “Bangkok” may match URIs that refer to a province, a hospital, or a university. To resolve these issues, our framework first finds the semantic type of a name, which is essential for resolving word ambiguity and retrieving the proper URI for a named entity. This paper presents a framework for finding semantic types and mapping named entities to URIs, i.e. URI lookup. A Named Entity Recognition (NER) technique is applied to find the semantic type of a column in a CSV dataset. The results are used to create an ontology and RDF data that include the URI mappings for named entities. We evaluate the two approaches by comparing the performance of a semantic search API, i.e. Wikipedia, with that of the NER technique on several datasets from the Data.go.th website.
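The idea of using a column's semantic type to disambiguate URI lookup can be illustrated with a minimal sketch. It is not the framework's implementation: the gazetteer below is a toy stand-in for a trained NER model, and the names `GAZETTEER`, `URI_INDEX`, `column_semantic_type`, and `lookup_uri`, as well as the `example.org` URIs, are all hypothetical.

```python
from collections import Counter

# Hypothetical gazetteer standing in for an NER model: each name maps to
# the semantic types it could denote (note that "Bangkok" is ambiguous).
GAZETTEER = {
    "Bangkok": ["Province", "Hospital", "University"],
    "Chiang Mai": ["Province", "University"],
    "Phuket": ["Province"],
}

# Hypothetical URI index keyed by (name, semantic type).
URI_INDEX = {
    ("Bangkok", "Province"): "http://example.org/province/Bangkok",
    ("Bangkok", "Hospital"): "http://example.org/hospital/Bangkok",
    ("Bangkok", "University"): "http://example.org/university/Bangkok",
    ("Chiang Mai", "Province"): "http://example.org/province/Chiang_Mai",
    ("Phuket", "Province"): "http://example.org/province/Phuket",
}


def column_semantic_type(values):
    """Pick the type that the most cell values in the column could denote."""
    votes = Counter()
    for value in values:
        for semantic_type in GAZETTEER.get(value, []):
            votes[semantic_type] += 1
    if not votes:
        return None
    # Sort the candidates first so that ties are broken deterministically.
    return max(sorted(votes), key=lambda t: votes[t])


def lookup_uri(name, semantic_type):
    """Return the URI for a name, restricted to the given semantic type."""
    return URI_INDEX.get((name, semantic_type))


column = ["Bangkok", "Chiang Mai", "Phuket"]
column_type = column_semantic_type(column)
mappings = {name: lookup_uri(name, column_type) for name in column}
```

In this sketch the whole column votes on a single type, so the ambiguous value “Bangkok” resolves to the province URI because the other values in the same column are only consistent with a province.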