When FPGA-Accelerator Meets Stream Data Processing in the Edge

Today, stream data applications are the killer applications for Edge computing: placing computation close to the data source facilitates real-time analysis. Previous efforts have focused on introducing lightweight distributed stream processing (DSP) systems and dividing the computation between Edge servers and the cloud. Unfortunately, given the limited computation power of Edge servers, these efforts may fail in practice to achieve the latency that stream data applications require. In this vision paper, we argue that by introducing FPGAs in Edge servers and integrating them into DSP systems, we might be able to realize stream data processing in Edge infrastructures. We demonstrate this through the design, implementation, and evaluation of F-Storm, an FPGA-accelerated, general-purpose distributed stream processing system for Edge servers. F-Storm integrates PCIe-based FPGAs into Edge-based stream processing systems and provides accelerators as a service for stream data applications. We evaluate F-Storm using representative stream data applications. Our experiments show that, compared to Storm, F-Storm reduces latency by 36% and 75% for the matrix multiplication and grep applications, respectively. It also obtains 1.4x and 2.1x throughput improvements for these two applications, respectively. We expect this work to accelerate progress in this domain.
