FIM-SIM: Fault Injection Module for CloudSim Based on Statistical Distributions

ICT systems are evolving rapidly in the way data is accessed and used. Cloud computing is an innovative way of using and providing computing resources to businesses and individuals, and its popularity has grown quickly in recent years. In this context, user expectations are rising and cloud providers face significant challenges. One of these challenges is fault tolerance, and both researchers and companies have focused on developing robust fault tolerance models. To validate these models, cloud simulation tools are used as an easy, flexible and fast solution. This paper proposes a Fault Injection Module for the CloudSim tool (FIM-SIM) to help cloud developers test and validate their infrastructure. FIM-SIM follows the event-driven model and inserts faults into CloudSim based on statistical distributions. The authors tested and validated it through several experiments designed to highlight the influence of the statistical distribution on the generated failures and to observe the behavior of CloudSim in its current state and implementation.

Keywords: cloud simulation, continuous distributions, discrete distributions, fault injector.
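The idea of injecting faults as events whose timing is drawn from a statistical distribution can be illustrated with a minimal sketch. It does not reproduce FIM-SIM or any CloudSim API. The function names, the `HOST_FAILURE` label, the choice of a Weibull distribution, and all parameter values are assumptions made for illustration only.

```python
import heapq
import random


def weibull_fault_times(shape, scale, horizon, rng):
    """Yield fault timestamps whose inter-arrival gaps are Weibull-distributed."""
    t = 0.0
    while True:
        # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
        t += rng.weibullvariate(scale, shape)
        if t > horizon:
            return
        yield t


def build_fault_events(num_hosts, shape, scale, horizon, seed=42):
    """Fill a time-ordered event queue with hypothetical host failures."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    queue = []
    for host_id in range(num_hosts):
        for t in weibull_fault_times(shape, scale, horizon, rng):
            heapq.heappush(queue, (t, host_id, "HOST_FAILURE"))
    return queue


def run(queue):
    """Process events in timestamp order, as an event-driven simulator would."""
    log = []
    while queue:
        log.append(heapq.heappop(queue))
    return log


# Assumed parameters: 3 hosts, Weibull shape 1.5, scale 100, horizon 500 time units.
events = run(build_fault_events(num_hosts=3, shape=1.5, scale=100.0, horizon=500.0))
for t, host_id, kind in events:
    print(f"t={t:8.2f}  host={host_id}  {kind}")
```

Swapping `rng.weibullvariate` for another sampler (for example `rng.expovariate`, or a discrete draw) changes how failures cluster in time, which is the kind of effect the experiments described above are designed to expose.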
