Paper information - Generating examples of paths summarizing RDF datasets

As datasets become too large to be comprehended directly, a need for data summarization arises. A data summary can present typical patterns commonly found in a dataset, from which a high-level understanding of the data can be obtained. Nonetheless, such abstract understanding can be improved by providing concrete examples of the summary patterns. If possible, the chosen examples should be diverse and representative of the patterns they instantiate. In this paper, we present three methods for generating examples of patterns discovered in RDF datasets. The patterns we consider are the most frequent path graphs that consist of classes of instances or data types of literals connected by RDF properties. We propose an RDF/S vocabulary for describing these path graphs and their instances. The three methods, namely random, distinct, and representative selection, are based on randomization, diversification, and clustering, respectively.

Paolo Tomeo | Vojtech Svátek | Marek Dudás | Jindrich Mynarz
