Architectural Resilience in Cloud, Fog and Edge Systems: A Survey

An increasing number of large-scale distributed systems are built by combining Cloud, Fog, and Edge computing, and there is a pressing need to understand how to ensure the resilience of such systems. This survey reports the state of the art in architectural approaches for ensuring the resilience of Cloud-, Fog-, and Edge-based systems. It presents a flexible taxonomy for reviewing architectural resilience approaches for distributed systems, together with a capability-based cyber-foraging framework intended to improve overall system resilience in the context of a physical node’s capabilities. It also highlights trust-related issues and solutions in the context of system resilience and reliability. The survey aims to improve understanding of the current state of system resilience solutions and to raise awareness of issues related to physical capabilities and trust management in the context of distributed systems resilience.
