Resilience as a New System Engineering for Cloud Computing

It has become increasingly evident that large-scale systems such as clouds can be brittle and may behave unpredictably when faced with unexpected disturbances. Even weak, seemingly innocuous disturbances can render a system inoperative and cause catastrophic harm to society. The goal of this research is to explore the fundamental principles and theories that govern cloud system resilience and to provide novel and effective mechanisms for modeling and enhancing the resilience of clouds. A food-web-like process interaction model is developed, and resilience enhancement mechanisms based on controlling the strength of interactions are proposed. The effectiveness and limitations of modularization for resilience enhancement are also illustrated using a replica consistency control protocol. The research shows that weakening key process interactions and modularizing complex systems are both very effective at enhancing resilience.
