Scalable reconstruction of RDF-archived relational databases

We have investigated approaches for scalable reconstruction of relational databases (RDBs) archived as RDF files. An archived RDB is reconstructed from two files in N-Triples format: a data archive, whose RDF triples represent the archived relational data content, and a schema archive, whose triples represent the relational schema describing that content. To reconstruct an archived RDB, the schema archive is read first, and a schema reconstruction algorithm automatically creates the RDB schema, identifying RDB elements by querying the schema archive. The RDB thus created is then populated by reading the data archive. We have developed two approaches for populating the RDB: the naive Insert Attribute Value (IAV) approach and the more complex Triple Bulk Load (TBL) approach. With IAV, stored procedures execute SQL INSERT or UPDATE statements to insert attribute values into the RDB tables. With TBL, the database is populated by bulk loading CSV files generated by sorting the data archive triples joined with schema information. Our experiments show that the TBL approach is substantially faster than the IAV approach.
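The contrast between the two population strategies can be illustrated with a minimal sketch. It is not the authors' implementation. The URI layout, the `SCHEMA` dictionary (a stand-in for the schema recovered from the schema archive), and all function names are assumptions made for illustration. SQLite's `executemany` stands in for a DBMS bulk loader, and plain Python functions stand in for stored procedures.

```python
import csv
import io
import re
import sqlite3
from itertools import groupby

# Hypothetical archive layout (an assumption, not the paper's URI scheme):
#   subject   <http://example.org/db/TABLE/KEY>    identifies a row
#   predicate <http://example.org/db/TABLE#COLUMN> identifies a column
TRIPLE = re.compile(r'<http://example\.org/db/(\w+)/(\w+)> '
                    r'<http://example\.org/db/\w+#(\w+)> "([^"]*)" \.')

DATA_ARCHIVE = """\
<http://example.org/db/person/2> <http://example.org/db/person#name> "Bob" .
<http://example.org/db/person/1> <http://example.org/db/person#name> "Ann" .
<http://example.org/db/person/1> <http://example.org/db/person#age> "41" .
<http://example.org/db/person/2> <http://example.org/db/person#age> "35" .
"""

# Stand-in for the reconstructed schema: table -> (key column, other columns).
SCHEMA = {"person": ("id", ["name", "age"])}


def parse(archive):
    """Yield (table, key, column, value) for each data triple."""
    for line in archive.splitlines():
        match = TRIPLE.fullmatch(line.strip())
        if match:
            yield match.groups()


def create_schema(conn):
    """Create one table per schema entry, with every column typed as TEXT."""
    for table, (key, cols) in SCHEMA.items():
        col_defs = ", ".join(f"{c} TEXT" for c in cols)
        conn.execute(f"CREATE TABLE {table} ({key} TEXT PRIMARY KEY, {col_defs})")


def populate_iav(conn, archive):
    """IAV-style: one UPDATE, or an INSERT if the row is new, per attribute value."""
    for table, key, column, value in parse(archive):
        key_col = SCHEMA[table][0]
        cur = conn.execute(
            f"UPDATE {table} SET {column} = ? WHERE {key_col} = ?", (value, key))
        if cur.rowcount == 0:  # row does not exist yet
            conn.execute(
                f"INSERT INTO {table} ({key_col}, {column}) VALUES (?, ?)",
                (key, value))


def triples_to_csv(archive):
    """TBL-style: sort triples by (table, key), then emit one CSV line per row."""
    buffers = {}
    row_of = lambda t: (t[0], t[1])
    for (table, key), group in groupby(sorted(parse(archive), key=row_of), key=row_of):
        values = {column: value for _, _, column, value in group}
        buf = buffers.setdefault(table, io.StringIO())
        csv.writer(buf, lineterminator="\n").writerow(
            [key] + [values.get(c, "") for c in SCHEMA[table][1]])
    return {table: buf.getvalue() for table, buf in buffers.items()}


def populate_tbl(conn, archive):
    """Load each generated CSV in one batch (stand-in for a real bulk loader)."""
    for table, text in triples_to_csv(archive).items():
        placeholders = ", ".join("?" * (len(SCHEMA[table][1]) + 1))
        conn.executemany(
            f"INSERT INTO {table} VALUES ({placeholders})",
            csv.reader(io.StringIO(text)))
```

Both functions yield the same table contents. The difference is that the IAV-style path issues one statement per triple, while the TBL-style path assembles complete rows first and hands them to the database in a single batch per table.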
