FTCloud: A Component Ranking Framework for Fault-Tolerant Cloud Applications

Cloud computing is becoming a mainstream aspect of information technology. Cloud applications are usually large-scale and complex, and comprise many distributed components. Building highly reliable cloud applications is therefore a challenging and critical research problem. To address this challenge, we propose FTCloud, a component-ranking-based framework for building fault-tolerant cloud applications. FTCloud uses the component invocation structures and invocation frequencies to identify the significant components in a cloud application. An algorithm is proposed to automatically determine the optimal fault-tolerance strategy for these significant components. The experimental results show that tolerating faults in a small fraction of the most significant components greatly improves the reliability of a cloud application.
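To make the ranking idea concrete, the following is a minimal sketch of how invocation structure and invocation frequency could be combined into a significance score. It assumes a PageRank-style random walk over a frequency-weighted invocation graph; the abstract does not give the actual FTCloud formula, so the function name `rank_components`, the `damping` parameter, and the example component names are all hypothetical.

```python
def rank_components(invocations, damping=0.85, iterations=100):
    """Rank components by a PageRank-style score (illustrative assumption only).

    invocations maps (caller, callee) -> observed invocation frequency.
    Returns a list of (component, score) pairs, most significant first.
    """
    components = sorted({c for edge in invocations for c in edge})
    n = len(components)
    # Ignore edges that were never exercised, to avoid dividing by zero below.
    edges = {e: f for e, f in invocations.items() if f > 0}

    # Total outgoing invocation frequency of each caller.
    out_total = {c: 0.0 for c in components}
    for (caller, _), freq in edges.items():
        out_total[caller] += freq

    scores = {c: 1.0 / n for c in components}
    for _ in range(iterations):
        # Score held by components that invoke nothing is spread evenly.
        dangling = sum(scores[c] for c in components if out_total[c] == 0)
        new = {c: (1 - damping) / n + damping * dangling / n for c in components}
        # A caller passes its score to callees in proportion to frequency.
        for (caller, callee), freq in edges.items():
            new[callee] += damping * scores[caller] * freq / out_total[caller]
        scores = new

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical invocation counts for a small application.
example = {
    ("web", "auth"): 10,
    ("web", "db"): 30,
    ("auth", "db"): 20,
    ("report", "db"): 5,
}
ranking = rank_components(example)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch a component scores highly when it is invoked often by components that are themselves significant, so a fault-tolerance strategy would be applied to the top entries of `ranking` first.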
