Generating examples of paths summarizing RDF datasets

As datasets become too large to be comprehended directly, a need for data summarization arises. A data summary can present typical patterns found in a dataset, from which a high-level understanding of the data can be obtained. Nonetheless, such abstract understanding can be improved by providing concrete examples of the summary patterns. Where possible, the chosen examples should be diverse and representative of the patterns they instantiate. In this paper, we address the generation of examples of patterns discovered in RDF datasets. The patterns we consider are the most frequent path graphs, which consist of classes of instances or data types of literals connected by RDF properties. We propose an RDF/S vocabulary for describing these path graphs and their instances. We present three methods for generating path examples, namely random, distinct, and representative selection, which are based on randomization, diversification, and clustering, respectively.