Cost-effective replication management and scheduling in edge computing

Abstract High volumes of data are continuously generated by Internet of Things (IoT) sensors in industrial settings. In particular, data-intensive workflows from IoT systems need to be processed in a real-time, reliable and low-cost way. Edge computing offers a low-latency and cost-effective paradigm for deploying such workflows. Data replication management and scheduling for delay-sensitive workflows in edge computing have therefore become challenging research issues. In this work, we first propose a replication management system for collaborative edge and cloud computing systems, which includes a dynamic replication creator, a specialized cost-effective scheduler for data placement, a system watcher and some data security tools. Then, considering task dependency, data reliability and data sharing, we model data scheduling for the workflows as an integer programming problem and present a faster meta-heuristic algorithm to solve it. The experimental results show that our algorithms achieve much better system performance than the traditional strategies they are compared against: they create a suitable number of data copies and find higher-quality replica placement solutions while reducing the total data access cost under the deadline constraint.
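To make the idea of replica placement as an integer program more concrete, here is a minimal toy sketch. It is not the formulation or the meta-heuristic from the paper: the cost model, the function name `best_placement`, and all parameters are assumptions made for illustration, and the tiny instance is solved by exhaustive search over binary placement vectors rather than by a meta-heuristic. Each node either holds a replica (1) or not (0), each task reads from its cheapest replica that meets the deadline, and the objective is storage cost plus access cost.

```python
from itertools import product


def best_placement(access_cost, storage_cost, latency, deadline, min_copies=1):
    """Exhaustively search binary placement vectors x (hypothetical toy model).

    access_cost[i][j]: cost for task i to read the data from node j
    storage_cost[j]:   cost of keeping one replica on node j
    latency[i][j]:     time for task i to read the data from node j
    x[j] = 1 means node j holds a replica. Each task reads from its
    cheapest replica whose latency does not exceed the deadline.
    Returns (total_cost, x), or (inf, None) if no placement is feasible.
    """
    n_tasks, n_nodes = len(access_cost), len(storage_cost)
    best = (float("inf"), None)
    for x in product((0, 1), repeat=n_nodes):
        if sum(x) < min_copies:  # reliability: require a minimum replica count
            continue
        total = sum(storage_cost[j] for j in range(n_nodes) if x[j])
        feasible = True
        for i in range(n_tasks):
            # Replicas that task i can reach within the deadline.
            options = [access_cost[i][j] for j in range(n_nodes)
                       if x[j] and latency[i][j] <= deadline]
            if not options:
                feasible = False
                break
            total += min(options)
        if feasible and total < best[0]:
            best = (total, x)
    return best
```

Exhaustive search visits 2^n placements for n nodes, so it is only usable on very small instances; that growth is the usual motivation for turning to a meta-heuristic on realistic problem sizes.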
