Model-Based Sensitivity of a Disaster Tolerant Active-Active GENESIS Cloud System

Modern cloud computing systems are prone to disasters, and the cost of the resulting service outages is reportedly huge. Some previous works used hierarchical models, such as fault trees (FT) and reliability block diagrams (RBD), together with state-space models, such as continuous-time Markov chains (CTMC) or stochastic Petri nets (SPN), to assess the reliability and availability of cloud systems, but with considerable simplification. In this paper, we propose a combinatorial monolithic model based on a reliability graph (RG) for a real-world cloud system called the general purpose integrated cloud system (GENESIS). The system is designed in an active-active high-availability configuration with two geographically distributed cloud sites for the sake of disaster tolerance (DT). We then present a comprehensive model-based analysis of system reliability and availability and of their sensitivity. The results indicate that an active-active architecture with geographically dispersed sites and appropriate interconnections apparently enhances system reliability and availability and assures disaster tolerance for the cloud.
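
To make the idea of availability and its sensitivity concrete, the following is a minimal sketch and not the RG model of the paper. It assumes two identical, independent sites in a simple parallel arrangement (the system is up if at least one site is up) and estimates sensitivity with a finite difference. All function names and the MTTF/MTTR values are hypothetical and chosen only for illustration.

```python
def steady_state_availability(mttf_hours, mttr_hours):
    """Steady-state availability of one site: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def parallel_availability(site_availabilities):
    """Availability when at least one of several independent sites must be up."""
    unavailability = 1.0
    for a in site_availabilities:
        unavailability *= (1.0 - a)
    return 1.0 - unavailability


def two_site_availability(mttf_hours, mttr_hours):
    """Assumed active-active pair of identical, independent sites."""
    a = steady_state_availability(mttf_hours, mttr_hours)
    return parallel_availability([a, a])


def sensitivity(func, params, name, rel_step=1e-4):
    """Central finite-difference derivative of func(**params) w.r.t. params[name]."""
    base = params[name]
    h = base * rel_step
    upper = dict(params, **{name: base + h})
    lower = dict(params, **{name: base - h})
    return (func(**upper) - func(**lower)) / (2.0 * h)


if __name__ == "__main__":
    # Hypothetical per-site parameters, not taken from the paper.
    params = {"mttf_hours": 990.0, "mttr_hours": 10.0}
    print("single site :", steady_state_availability(**params))
    print("two sites   :", two_site_availability(**params))
    for name in params:
        print("dA/d" + name, "=", sensitivity(two_site_availability, params, name))
```

Under these assumptions, the second site lowers unavailability from (1 - a) to (1 - a)^2, and the sign of each derivative shows whether raising that parameter helps or hurts availability. A real analysis would also need to model the interconnections and common-cause disaster events, which this sketch leaves out.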
