The GriddLeS data replication service

The grid provides infrastructure that allows an arbitrary application to be executed on a range of different computational resources. When input files are very large, or when fault tolerance is important, the data may be replicated. Existing grid data replication middleware suffers from two shortcomings. First, it typically requires modification of existing applications. Second, it offers limited support for automatic resource selection, so a user usually chooses the replica manually to optimize system performance. In this paper we discuss a middleware layer, the GriddLeS Replication Service (GRS), that sits above existing replication services and addresses both shortcomings. Two case studies are presented that illustrate the effectiveness of the approach.
