REGISTOR: A Platform for Unstructured Data Processing Inside SSD Storage

This paper presents REGISTOR, a platform for regular expression grabbing inside storage. The main idea of REGISTOR is to accelerate regular expression (regex) search inside the storage device where large data sets reside, eliminating the I/O bottleneck. A special hardware engine for regex search is designed and integrated into a flash SSD; it processes data on the fly as the data is transferred from NAND flash to the host. To make the speed of regex search match the internal bus speed of a modern SSD, the REGISTOR hardware uses a deep pipeline consisting of a file semantics extractor, a matching candidates finder, regex matching units (REMUs), and a results organizer. Furthermore, each stage of the pipeline exploits as much parallelism as possible. To make REGISTOR readily usable by high-level applications, we have developed a set of APIs and libraries in Linux that allow REGISTOR to process files in the SSD by efficiently recombining separate data blocks into files. A working prototype of REGISTOR has been built in our newly designed NVMe SSD. Extensive experiments and analyses show that REGISTOR achieves high throughput, reduces the I/O bandwidth requirement by up to 97%, and reduces CPU utilization by as much as 82% for regex search in large data sets.
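
The four pipeline stages named above can be illustrated with a small software analogue. The sketch below is not the REGISTOR hardware design or its API. It is a minimal Python model under stated assumptions: records are newline-delimited, the candidates finder is modeled as a cheap literal prefilter, and every function name (`extract_records`, `find_candidates`, `match_records`, `organize_results`, `search_stream`) is hypothetical. It shows only how data arriving as separate blocks might flow through extraction, candidate filtering, regex matching, and result collection.

```python
import re
from typing import Iterable, Iterator, List, Pattern, Tuple

Record = Tuple[int, bytes]  # (1-based line number, line content)


def extract_records(blocks: Iterable[bytes]) -> Iterator[Record]:
    """Stage 1 (assumed): rebuild newline-delimited records from raw blocks."""
    tail = b""
    line_no = 0
    for block in blocks:
        # The last piece may be an incomplete record; carry it to the next block.
        *lines, tail = (tail + block).split(b"\n")
        for line in lines:
            line_no += 1
            yield line_no, line
    if tail:
        line_no += 1
        yield line_no, tail


def find_candidates(records: Iterable[Record], literal: bytes) -> Iterator[Record]:
    """Stage 2 (assumed): keep only records containing a required literal."""
    for line_no, line in records:
        if literal in line:
            yield line_no, line


def match_records(candidates: Iterable[Record], pattern: Pattern[bytes]) -> Iterator[Record]:
    """Stage 3 (assumed): run the full regex only on surviving candidates."""
    for line_no, line in candidates:
        m = pattern.search(line)
        if m:
            yield line_no, m.group(0)


def organize_results(matches: Iterable[Record]) -> List[Record]:
    """Stage 4 (assumed): collect matches in record order for the host."""
    return sorted(matches)


def search_stream(blocks: Iterable[bytes], regex: bytes, literal: bytes) -> List[Record]:
    """Chain the four stages; generators let records flow through one at a time."""
    pattern = re.compile(regex)
    records = extract_records(blocks)
    return organize_results(match_records(find_candidates(records, literal), pattern))


if __name__ == "__main__":
    data = b"GET /a 200\nPOST /b 500\nGET /c 404\nerror 5xx\n"
    # Split into 8-byte blocks to mimic records spanning block boundaries.
    blocks = [data[i:i + 8] for i in range(0, len(data), 8)]
    print(search_stream(blocks, rb"GET /\w+ \d{3}", b"GET"))
```

Only the matched records leave the final stage, which is the software counterpart of the reduced I/O bandwidth the abstract reports: the host receives results rather than the full data set.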
