A Review of RDF Storage in NoSQL Databases

The Resource Description Framework (RDF) is a model for representing information resources on the Web. With the widespread acceptance of RDF as the de facto standard recommended by the W3C (World Wide Web Consortium) for representing and exchanging information on the Web, a large and growing volume of RDF data is becoming available. Consequently, RDF data management is of increasing importance and has attracted attention in both the database community and the Semantic Web community. Much work has been devoted to proposing solutions for storing large-scale RDF data efficiently. To manage massive RDF data, NoSQL (“not only SQL”) databases have been used as scalable RDF data stores. This chapter focuses on using various NoSQL databases to store massive RDF data. It provides an up-to-date overview of the state of the art in RDF data storage in NoSQL databases and offers suggestions for future research.
