Distributed Operator Placement for IoT Data Analytics Across Edge and Cloud Resources

The number of Internet of Things (IoT) applications is forecast to grow exponentially within the coming decade. Owners of such applications strive to make predictions from large streams of complex input in near real time. Cloud-based architectures often centralize storage and processing, generating high data movement overheads that penalize real-time applications. Edge and cloud architectures push computation closer to where the data is generated, reducing the cost of data movement and improving application response time. The heterogeneity among edge devices and cloud servers makes it challenging to decide how to split and orchestrate IoT applications across the edge and the cloud. In this paper, we extend our IoT edge framework, R-Pulsar, with a solution for splitting IoT applications dynamically across the edge and the cloud, improving performance metrics such as end-to-end latency (response time), bandwidth consumption, and edge-to-cloud and cloud-to-edge messaging cost. Our approach consists of a programming model and a real-world implementation of an IoT application. The results show that our approach can reduce end-to-end latency by at least 38% by pushing part of the IoT application to the edge. Meanwhile, edge-to-cloud data transfers are reduced by at least 38% and messaging costs are reduced by at least 50% when using existing commercial edge and cloud cost models.
