Fault-Tolerant Multiuser Computational Grids based on Tuple Spaces

This paper proposes GridTS, a grid infrastructure in which resources select the tasks they execute, rather than a scheduler finding resources for the tasks. Because each resource makes this choice itself, scheduling decisions are made with up-to-date information about the resources. GridTS provides fault-tolerant scheduling by combining a set of fault tolerance techniques so that crash faults in any component of the system are tolerated. Communication is supported by a tuple space.
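
The pull-based idea, in which resources take tasks from a shared tuple space rather than having a scheduler push tasks to them, can be illustrated with a minimal sketch. This is not the GridTS implementation: the `TupleSpace` class, the `run_resource` function, the tuple layouts, and the squaring "task" are all hypothetical names chosen for illustration. The operations loosely follow the Linda-style `out`, `rdp`, and `inp` primitives, and none of the fault tolerance techniques mentioned above are modeled.

```python
import threading

ANY = None  # assumption: None acts as a wildcard field in a template


class TupleSpace:
    """Toy in-memory tuple space (illustrative only, not the GridTS API)."""

    def __init__(self):
        self._tuples = []
        self._lock = threading.Lock()

    @staticmethod
    def _matches(template, tup):
        # A template matches a tuple of the same length whose fields
        # are equal wherever the template is not a wildcard.
        return len(template) == len(tup) and all(
            t is ANY or t == v for t, v in zip(template, tup)
        )

    def out(self, tup):
        """Insert a tuple into the space."""
        with self._lock:
            self._tuples.append(tup)

    def rdp(self, template):
        """Non-blocking read: return a matching tuple without removing it."""
        with self._lock:
            for tup in self._tuples:
                if self._matches(template, tup):
                    return tup
            return None

    def inp(self, template):
        """Non-blocking take: atomically remove and return a matching tuple."""
        with self._lock:
            for i, tup in enumerate(self._tuples):
                if self._matches(template, tup):
                    return self._tuples.pop(i)
            return None


def run_resource(space, name):
    """A resource pulls tasks until none remain and publishes each result."""
    done = 0
    while True:
        task = space.inp(("task", ANY, ANY))  # the resource picks its own work
        if task is None:
            return done
        _, task_id, n = task
        space.out(("result", task_id, n * n, name))  # hypothetical workload
        done += 1


# A user submits three tasks; one resource drains them.
space = TupleSpace()
for task_id, n in enumerate([2, 3, 4]):
    space.out(("task", task_id, n))

completed = run_resource(space, "resource-A")
results = sorted(space.inp(("result", i, ANY, ANY))[2] for i in range(3))
```

Because `inp` removes the tuple atomically, two resources calling it concurrently would never obtain the same task in this sketch. A crash after the take would lose the task here, which is exactly the kind of failure a fault-tolerant design has to handle separately.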
