This paper introduces RDFStats, a generator for statistics about RDF sources such as SPARQL endpoints and RDF documents. Beyond the generator itself, RDFStats provides an API for persisting and accessing statistics, including several estimation functions that also support SPARQL filter-like expressions. Detailed statistics about the contents of RDF data sources are very important for many Semantic Web applications, such as the Semantic Web Integrator and Query Engine (SemWIQ), which is currently being developed at the University of Linz. RDFStats was designed and implemented primarily for the SemWIQ federator and optimizer, but it can also be used in other applications such as linked data browsers, aggregators, or visualization tools. It is based on Jena, the popular Semantic Web framework developed by HP Labs Bristol, and can be easily extended and integrated into other applications.