RDF analytics: lenses over semantic graphs

The development of the Semantic Web (RDF) brings new requirements for data analytics tools and methods, going beyond querying to semantics-rich analytics through warehouse-style tools. In this work, we fully redesign, from the bottom up, core data analytics concepts and tools in the context of RDF data, leading to the first complete formal framework for warehouse-style RDF analytics. Notably, we define i) analytical schemas tailored to heterogeneous, semantics-rich RDF graphs, ii) analytical queries which (beyond relational cubes) allow flexible querying of the data and the schema as well as powerful aggregation, and iii) OLAP-style operations. Experiments on a fully implemented platform demonstrate the practical interest of our approach.
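To make the idea of aggregation over an RDF graph more concrete, the following is a minimal sketch in plain Python. It is not the formal framework or the platform described above: the toy triples, the property names (`livesIn`, `wrote`), and the helper `count_by_dimension` are all hypothetical, chosen only to illustrate grouping facts by a dimension and counting a measure when some resources lack some properties.

```python
from collections import defaultdict

# Toy RDF graph as (subject, predicate, object) triples.
# All resource and property names here are hypothetical.
TRIPLES = [
    ("user1", "rdf:type", "Blogger"),
    ("user1", "livesIn", "Madrid"),
    ("user1", "wrote", "post1"),
    ("user1", "wrote", "post2"),
    ("user2", "rdf:type", "Blogger"),
    ("user2", "livesIn", "Madrid"),
    ("user2", "wrote", "post3"),
    ("user3", "rdf:type", "Blogger"),
    ("user3", "livesIn", "Kyoto"),
    ("user3", "wrote", "post4"),
    # Heterogeneous data: this blogger has neither a city nor any posts.
    ("user4", "rdf:type", "Blogger"),
]


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def count_by_dimension(triples, fact_class, dimension, measure):
    """Group instances of fact_class by their dimension values and
    count, per group, the values of the measure property."""
    totals = defaultdict(int)
    facts = [s for s, p, o in triples if p == "rdf:type" and o == fact_class]
    for fact in facts:
        measure_count = len(objects(triples, fact, measure))
        # A fact with no dimension value contributes to no group.
        for value in objects(triples, fact, dimension):
            totals[value] += measure_count
    return dict(totals)


# Number of posts written by bloggers, grouped by city.
cube = count_by_dimension(TRIPLES, "Blogger", "livesIn", "wrote")
```

In this sketch a resource without a value for the dimension simply falls out of the result, which is one possible way to handle missing properties; a real system would have to make that choice explicit.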
