Boosting XML Filtering with a Scalable FPGA-based Architecture

The growing amount of XML-encoded data exchanged over the Internet increases the importance of XML-based publish-subscribe (pub-sub) and content-based routing systems. The input to such systems typically consists of a stream of XML documents and a set of user subscriptions expressed as XML queries. The pub-sub system filters the published documents and passes them to the matching subscribers. Pub-sub systems are characterized by very high input rates, so processing time is critical. In this paper we propose a "pure hardware" solution that uses XPath query blocks on an FPGA to solve the filtering problem. By exploiting the high throughput that an FPGA provides for parallel processing, our approach achieves drastically better throughput than existing software or mixed (hardware/software) architectures. The XPath queries (subscriptions) are translated to regular expressions, which are then mapped to FPGA devices. By introducing stacks within the FPGA, we are able to express and process a wide range of path queries very efficiently in a scalable environment. Moreover, because the parser and the filter processing are performed on the same FPGA chip, the expensive communication costs that a multi-core system would incur are eliminated, enabling very fast and efficient pipelining. Our experimental evaluation reveals more than one order of magnitude improvement over traditional pub-sub systems.
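To make the query translation concrete, here is a minimal software sketch of the general idea, not the hardware design itself. It assumes a restricted XPath subset (only the `/` and `//` axes with tag names or `*`), and the function names `xpath_to_regex` and `filter_stream`, the event format, and the use of a stack to hold the current root-to-node path are all illustrative assumptions; the abstract does not specify how the FPGA stacks are organized.

```python
import re


def xpath_to_regex(xpath):
    """Translate a simple XPath (assumed subset: '/' and '//' axes, tag
    names or '*') into a regex over root-to-node paths such as '/a/b/c'."""
    parts = []
    for axis, name in re.findall(r'(//|/)([^/]+)', xpath):
        step = r'[^/]+' if name == '*' else re.escape(name)
        if axis == '//':
            # Descendant axis: allow any number of intermediate elements.
            parts.append(r'(?:/[^/]+)*/' + step)
        else:
            # Child axis: exactly one level down.
            parts.append('/' + step)
    return re.compile(''.join(parts))


def filter_stream(events, queries):
    """Return the set of queries matched by a stream of parser events.

    `events` is a hypothetical sequence of ('open', tag) / ('close', tag)
    tuples, as a streaming XML parser might emit them.
    """
    compiled = {q: xpath_to_regex(q) for q in queries}
    stack = []          # tags of the currently open elements
    matched = set()
    for kind, tag in events:
        if kind == 'open':
            stack.append(tag)
            path = '/' + '/'.join(stack)
            for query, regex in compiled.items():
                if regex.fullmatch(path):
                    matched.add(query)
        elif kind == 'close':
            stack.pop()
    return matched


# Events for the document <a><b/><c><d/></c></a>
EVENTS = [
    ('open', 'a'), ('open', 'b'), ('close', 'b'),
    ('open', 'c'), ('open', 'd'), ('close', 'd'),
    ('close', 'c'), ('close', 'a'),
]
QUERIES = ['/a/b', '/a/d', '//d', '/a//d', '/a/*/d', '/b']
RESULT = filter_stream(EVENTS, QUERIES)
```

In this sketch every query is checked sequentially on each open tag; the appeal of a hardware mapping is that all query blocks can evaluate the same event in parallel.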
