Fast String Searching on PISA

Theo Jepsen

This paper presents PPS, a system that uses a programmable network ASIC to locate occurrences of string keywords in packet payloads. The PPS compiler first converts the keywords into Deterministic Finite Automata (DFA) representations, and then maps each DFA to a sequence of forwarding tables in the switch pipeline. Our design leverages several hardware primitives (e.g., TCAM, hashing, parallel tables) to achieve high throughput. Our evaluation shows that PPS achieves significantly higher throughput and lower latency than string searches running on CPUs, GPUs, or FPGAs.
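
To make the keyword-to-DFA-to-table idea concrete, the following is a minimal software sketch, not the PPS compiler itself. It assumes an Aho-Corasick style construction (the abstract does not say which DFA construction PPS uses) and flattens the automaton into exact-match entries of the form (state, byte) -> next state, with a default transition to state 0. The names `build_dfa` and `search` are hypothetical and chosen for illustration only; a hardware mapping would also have to deal with table sizes, TCAM wildcards, and pipeline depth, none of which is modeled here.

```python
from collections import deque


def build_dfa(keywords):
    """Build an Aho-Corasick style DFA over bytes (illustrative sketch only).

    keywords: iterable of non-empty bytes objects.
    Returns (table, accepting):
      table     maps (state, byte) -> next_state; missing entries mean state 0.
      accepting maps state -> set of keywords that end when that state is reached.
    """
    # Step 1: build a trie of the keywords.
    goto = [{}]      # goto[s][byte] -> child state in the trie
    out = [set()]    # out[s] -> keywords recognized on entering state s
    for kw in keywords:
        s = 0
        for b in kw:
            if b not in goto[s]:
                goto.append({})
                out.append(set())
                goto[s][b] = len(goto) - 1
            s = goto[s][b]
        out[s].add(kw)

    # Step 2: breadth-first pass that resolves failure links into
    # explicit DFA transitions, so matching needs one lookup per byte.
    fail = [0] * len(goto)
    dfa = [dict() for _ in goto]
    dfa[0] = dict(goto[0])
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        # Inherit the transitions of the (shallower, already complete) failure state.
        dfa[s] = dict(dfa[fail[s]])
        for b, t in goto[s].items():
            fail[t] = dfa[fail[s]].get(b, 0)
            out[t] |= out[fail[t]]   # keywords that are suffixes of this path
            dfa[s][b] = t
            queue.append(t)

    # Step 3: flatten into match-action style entries.
    table = {(s, b): t for s, trans in enumerate(dfa) for b, t in trans.items()}
    accepting = {s: kws for s, kws in enumerate(out) if kws}
    return table, accepting


def search(payload, table, accepting):
    """Scan payload one byte at a time; return (start_offset, keyword) pairs."""
    state = 0
    hits = []
    for i, b in enumerate(payload):
        state = table.get((state, b), 0)   # default action: back to the start state
        for kw in accepting.get(state, ()):
            hits.append((i - len(kw) + 1, kw))
    return hits


if __name__ == "__main__":
    table, accepting = build_dfa([b"he", b"she", b"hers"])
    print(sorted(search(b"ushers", table, accepting)))
```

In this sketch each payload byte costs exactly one table lookup, which is the property that makes a DFA a natural fit for a fixed sequence of match stages; how PPS actually distributes states across pipeline tables is described in the paper, not here.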
