An analytical model to evaluate reliability of cloud computing systems in the presence of QoS requirements

Cloud computing is widely referred to as the next generation of computing systems. Reliability is a key metric for assessing the performance of such systems. Redundancy and diversity are the prevalent approaches to enhancing reliability in Cloud Computing Systems (CCSs). Proper resource allocation is an alternative approach to improving reliability in such systems. In contrast to redundancy, appropriate resource allocation can improve system reliability without imposing extra cost. However, considering reliability without regard to Quality of Service (QoS) requirements may be undesirable in most CCSs. In this paper, we focus on the resource allocation approach and introduce an analytical model that evaluates system reliability while accounting for application and resource constraints. Task precedence structure and QoS are taken into account as the application constraints. The memory and storage limits of each server, as well as the maximum communication load on each link, are considered the principal resource constraints. In addition, the effect of network topology on system reliability is discussed in detail, and the model is extended to cover various network topologies.
