Non-native RDF Storage Engines

The proliferation of heterogeneous Linked Data requires data management systems to constantly improve their scalability and efficiency. Linked Data can be stored according to many different data storage models. Some of these models reuse general-purpose database storage techniques to persist Linked Data, and can therefore leverage existing data processing environments (e.g., large Hadoop clusters). We therefore survey the multiplicity of such Linked Data storage systems, which we categorize into the following classes: relational database-based systems, NoSQL-based systems, and massively parallel systems.
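To make the first class concrete, the following minimal sketch shows one common way a general-purpose relational engine can persist RDF: a single three-column triple table, with a basic graph pattern answered by a self-join. It is an illustration under stated assumptions, not the design of any particular system discussed here; the table layout, the function names `build_triple_store` and `authors_of_typed`, and the `ex:` sample data are all hypothetical.

```python
import sqlite3

# Hypothetical sample data: (subject, predicate, object) triples.
SAMPLE = [
    ("ex:paper1", "rdf:type", "ex:Article"),
    ("ex:paper1", "ex:author", "ex:alice"),
    ("ex:paper2", "rdf:type", "ex:Article"),
    ("ex:paper2", "ex:author", "ex:bob"),
    ("ex:book1", "rdf:type", "ex:Book"),
    ("ex:book1", "ex:author", "ex:carol"),
]


def build_triple_store(triples):
    """Load triples into an in-memory SQLite triple table (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (s TEXT NOT NULL, p TEXT NOT NULL, o TEXT NOT NULL)"
    )
    # One secondary index ordered predicate-object-subject; real systems
    # typically maintain several such permutations.
    conn.execute("CREATE INDEX idx_pos ON triples (p, o, s)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    return conn


def authors_of_typed(conn, rdf_class):
    """Roughly: SELECT ?x ?a WHERE { ?x rdf:type <rdf_class> . ?x ex:author ?a }.

    Each triple pattern becomes one scan of the table; the shared
    variable ?x becomes the join condition of the self-join.
    """
    return conn.execute(
        "SELECT t1.s, t2.o "
        "FROM triples AS t1 JOIN triples AS t2 ON t1.s = t2.s "
        "WHERE t1.p = 'rdf:type' AND t1.o = ? AND t2.p = 'ex:author' "
        "ORDER BY t1.s, t2.o",
        (rdf_class,),
    ).fetchall()


if __name__ == "__main__":
    store = build_triple_store(SAMPLE)
    for subject, author in authors_of_typed(store, "ex:Article"):
        print(subject, author)
```

Every additional triple pattern in a query adds another self-join on the same table. That growing join cost is one reason a storage system might move away from a single triple table toward other layouts.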
