A Unifying Query Model to Explain Heterogeneity of RDF Stream Processing Systems

RDF and SPARQL are established standards for data interchange and querying on the Web. While they have been shown to be useful and applicable in many scenarios, they are not sufficiently adequate for dealing with streams of data and their intrinsic continuous nature. In the last years data and query languages have been proposed to extend both RDF and SPARQL for streams and continuous processing, under the name of RDF Stream Processing – RSP. These efforts resulted in several models and implementations that, at a first look, appear to propose alternative syntaxes but equivalent semantics. However, when asked to continuously answer the same queries on the same data streams, they provide different answers at disparate moments due to the heterogeneity of their operational semantics. These discrepancies render the process of understanding and comparing continuous query results complex and misleading. In this work, the authors propose RSP-QL, a comprehensive model that formally defines the semantics of an RSP system. RSP-QL makes explicit the hidden assumptions of currently available RSP systems, allows defining a formal notion of correctness for RSP query results and, thus, explains why available implementations provide different answers at disparate moments. RSP-QL Semantics: A Unifying Query Model to Explain Heterogeneity of RDF Stream Processing Systems