Paper Information - Efficient and Scalable SPARQL Query Processing with Transformed Table

Efficient and Scalable SPARQL Query Processing with Transformed Table

Resource Description Framework (RDF) is a core technology of the Semantic Web and has become increasingly popular in recent years. With the rapid growth of RDF data, the triple store, which serves as both the RDF data storage and the query engine, requires more scalable and efficient technologies. The MapReduce programming model and NoSQL database systems such as HBase are well-known solutions for large-scale data processing, and they can be used to improve the scalability and performance of triple queries, known as SPARQL query processing. However, in the general case, the subject of a triple is used as the row key in the table. For some queries, finding the triples that match a triple pattern is then a time-consuming job. We therefore design another table with a different storage schema, called the Transformed Table, to reduce the time cost of read operations. The experimental results show that using the Transformed Table improves triple query performance significantly.
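The row-key issue described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the schema of the Transformed Table, so the (predicate, object) key used below is an assumption chosen only for illustration, not the authors' design. Plain Python dictionaries stand in for HBase tables, and all function names and sample triples are hypothetical.

```python
from collections import defaultdict

# Sample triples (hypothetical data, for illustration only).
TRIPLES = [
    ("ex:alice", "rdf:type", "ub:Student"),
    ("ex:alice", "ub:memberOf", "ex:dept1"),
    ("ex:bob", "rdf:type", "ub:Professor"),
    ("ex:carol", "rdf:type", "ub:Student"),
]


def build_subject_table(triples):
    """General-case layout: the subject is the row key."""
    table = defaultdict(list)
    for s, p, o in triples:
        table[s].append((p, o))
    return table


def build_transformed_table(triples):
    """Assumed alternative layout: (predicate, object) is the row key.

    The real Transformed Table schema is not given in the abstract.
    """
    table = defaultdict(list)
    for s, p, o in triples:
        table[(p, o)].append(s)
    return table


def match_by_scan(subject_table, predicate, obj):
    """Answer the pattern (?s, predicate, obj) by scanning every row."""
    return sorted(
        s for s, cols in subject_table.items() if (predicate, obj) in cols
    )


def match_by_lookup(transformed_table, predicate, obj):
    """Answer the same pattern with a single keyed read."""
    return sorted(transformed_table.get((predicate, obj), []))


subject_table = build_subject_table(TRIPLES)
transformed_table = build_transformed_table(TRIPLES)

students_scan = match_by_scan(subject_table, "rdf:type", "ub:Student")
students_lookup = match_by_lookup(transformed_table, "rdf:type", "ub:Student")
```

Both functions return the same subjects, but the scan touches every row of the subject-keyed table, while the lookup reads one row of the second table. This is the kind of read-cost reduction the abstract attributes to a second storage schema, at the price of storing the data twice.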

Ce-Kuen Shieh | Ming-Fong Tsai | Sheng-Wei Huang | Chia-Ho Yu

[1] Jianling Sun, et al. Scalable RDF store based on HBase and MapReduce, 2010, 2010 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE).

[2] Michael Stonebraker, et al. A comparison of approaches to large-scale data analysis, 2009, SIGMOD Conference.

[3] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.

[4] Jeff Heflin, et al. LUBM: A benchmark for OWL knowledge base systems, 2005, J. Web Semant.

[5] Georg Lausen, et al. Cascading Map-Side Joins over HBase for Scalable Join Processing, 2012, SSWS+HPCSW@ISWC.

[6] Sanjay Ghemawat, et al. MapReduce: a flexible data processing tool, 2010, CACM.

[7] Ioannis Konstantinou, et al. H2RDF: adaptive query processing on RDF data in the cloud, 2012, WWW.

[8] Dave Reynolds, et al. SPARQL basic graph pattern optimization using selectivity estimation, 2008, WWW.