Parallel Optimization of Queries in XML Dataset Using GPU

XML plays a crucial role in web services, databases, and document processing, so efficient processing of XML queries has become an important problem. At the same time, a growing number of users demands high throughput: tens of thousands of queries must be executed in a short time. Building on the success of GPGPU (general-purpose computing on graphics processing units), we propose a GPU-based parallel XML query model, built mainly around two efficient task distribution strategies, to improve both the efficiency and the throughput of XML queries. We implemented a parallel, simplified XPath language on the GPU using the Compute Unified Device Architecture (CUDA) and evaluated our model on a recent NVIDIA GPU against its counterpart on an eight-core CPU. The experimental results show that our model achieves higher throughput and efficiency than CPU-based XML query processing.
