Towards a tabular open data search engine for public sector information

Public Sector Information (PSI) scenarios require tools that support the retrieval of tabular open data beyond keyword-based search over metadata. This paper presents a novel interface for searching tabular open data, together with a search engine that retrieves tabular data by considering table contents in addition to metadata. The search engine uses word embeddings to compute the semantic similarity between tabular datasets and returns a ranking of candidate datasets to be integrated with an input query table, according to the intention of the open data reuser (e.g. column extension, row extension, or data completion). An initial set of experiments has been conducted, showing promising results for this task.
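
As an illustration of the general idea of embedding-based table ranking, the following minimal sketch represents each table as the average of the word vectors of its tokens and ranks candidate tables by cosine similarity to a query table. It is not the system described in this paper: the toy three-dimensional vectors, the table names, and the helper functions `table_vector`, `cosine` and `rank_tables` are all hypothetical, and a real setting would use vectors from a trained embedding model and the actual cell and header contents of the tables.

```python
import math

# Toy 3-dimensional "embeddings" with made-up values (assumption: a real
# system would load vectors from a trained word embedding model instead).
TOY_EMBEDDINGS = {
    "city": [0.9, 0.1, 0.0],
    "town": [0.8, 0.2, 0.0],
    "population": [0.1, 0.9, 0.0],
    "inhabitants": [0.2, 0.8, 0.1],
    "budget": [0.0, 0.1, 0.9],
    "spending": [0.1, 0.0, 0.9],
}
DIMENSIONS = 3


def table_vector(tokens):
    """Average the embeddings of the tokens that have a known vector."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * DIMENSIONS
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_tables(query_tokens, candidates):
    """Return (table name, similarity) pairs, most similar first."""
    query_vec = table_vector(query_tokens)
    scored = [
        (name, cosine(query_vec, table_vector(tokens)))
        for name, tokens in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidate tables, each reduced to a bag of tokens.
CANDIDATES = {
    "census": ["town", "inhabitants"],
    "finance": ["budget", "spending"],
}

# Query table whose tokens share no exact keyword with either candidate.
ranking = rank_tables(["city", "population"], CANDIDATES)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy example the query table shares no keyword with either candidate, yet the census table ranks first because its tokens lie close to the query tokens in the vector space. This is the kind of match that keyword search over metadata alone would miss.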