Jumpgate: automating integration of network connected accelerators

Network-connected accelerators (NCAs), such as programmable switches, ASICs, and FPGAs, can speed up operations in data analytics. So far, however, integrating NCAs into data analytics systems has required manual effort. We present Jumpgate, a system that simplifies integration of existing NCA code into data analytics systems such as Apache Spark or Presto. Jumpgate places most of the integration code in the analytics system, where it needs to be written only once, leaving NCA programmers to write only a couple hundred lines of code to integrate a new NCA. Jumpgate relies on the dataflow graphs that most analytics systems use internally, and takes care of invoking NCAs, performing the necessary format conversion, and orchestrating their execution via novel staged network pipelines. Our implementation of Jumpgate in Apache Spark made it possible, for the first time, to study the benefits and drawbacks of using NCAs across the entire range of queries in the TPC-DS benchmark. Since we lack hardware that can accelerate all analytics operations, we implemented NCAs in software. We report on how and when analytics workloads will benefit from NCAs to motivate future designs.
