Acceleration of XML Parsing through Prefetching

Extensible Markup Language (XML) has become a widely adopted standard for data representation and exchange. However, its features also introduce significant overhead that threatens the performance of modern applications. In this paper, we present a study of XML parsing and find that memory-side data loading in the parsing stage incurs a significant performance overhead, as much as the computation does. Hence, we propose memory-side acceleration, which incorporates data prefetching techniques and can be applied on top of computation-side acceleration to speed up XML data parsing. To this end, we study the impact of our proposed scheme on performance and energy consumption, and demonstrate that it improves performance by up to 20 percent and saves up to 12.77 percent of energy when implemented in 32-nm technology. In addition, we implement a prefetcher on a platform to evaluate its implementation feasibility in terms of area and energy overhead.
