Optimal resource allocation on grid systems for maximizing service reliability using a genetic algorithm

A grid computing system differs from conventional distributed computing systems in its focus on large-scale resource sharing and its open architecture for services. Global grid technologies, and the Globus Toolkit in particular, are evolving toward an open grid service architecture (OGSA), in which a grid system provides an extensible infrastructure so that various organizations can offer their own services and integrate their resources. This paper therefore addresses the problem of optimally allocating services on the grid to maximize grid service reliability. Since no existing study has analyzed grid service reliability, this paper develops initial models and evaluation algorithms for it. Based on this reliability evaluation, we present an optimization model for the grid service allocation problem and develop a genetic algorithm (GA) to solve it effectively. A numerical example illustrates the modeling procedure and the efficiency of the GA.
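To make the allocation problem concrete, the following is a minimal sketch of a GA that assigns services to grid nodes. It is not the paper's model: the reliability matrix `rel`, the per-node `capacity` limits, the product-form objective, and the function names `fitness` and `run_ga` are all assumptions chosen for illustration, and the operators (tournament selection, one-point crossover, per-gene mutation, elitism) are generic textbook choices.

```python
import math
import random


def fitness(alloc, rel, capacity):
    """Toy objective (an assumption, not the paper's reliability model):
    product of per-service reliabilities, or 0 if any node is overloaded."""
    load = [0] * len(capacity)
    for node in alloc:
        load[node] += 1
    if any(used > cap for used, cap in zip(load, capacity)):
        return 0.0
    return math.prod(rel[s][n] for s, n in enumerate(alloc))


def run_ga(rel, capacity, pop_size=30, generations=60, p_mut=0.1, seed=0):
    """Search for an allocation (one node index per service) with high fitness.
    Requires at least two services so that one-point crossover is defined."""
    rng = random.Random(seed)
    n_services, n_nodes = len(rel), len(capacity)

    def score(alloc):
        return fitness(alloc, rel, capacity)

    # Random initial population of chromosomes.
    pop = [[rng.randrange(n_nodes) for _ in range(n_services)]
           for _ in range(pop_size)]
    best = max(pop, key=score)

    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the best allocation forward unchanged
        while len(nxt) < pop_size:
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(pop, 2), key=score)
            p2 = max(rng.sample(pop, 2), key=score)
            # One-point crossover.
            cut = rng.randrange(1, n_services)
            child = p1[:cut] + p2[cut:]
            # Per-gene mutation: reassign a service to a random node.
            child = [rng.randrange(n_nodes) if rng.random() < p_mut else g
                     for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score)

    return best, score(best)


if __name__ == "__main__":
    # Hypothetical data: rel[s][n] is the reliability of service s on node n.
    rel = [[0.99, 0.95, 0.90],
           [0.92, 0.98, 0.94],
           [0.90, 0.93, 0.97],
           [0.96, 0.91, 0.95]]
    capacity = [2, 2, 2]  # each node hosts at most two services
    print(run_ga(rel, capacity))
```

Because the elite individual is copied into every generation, the best fitness never decreases; a GA of this kind is a heuristic, however, and does not guarantee the optimal allocation.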
