Utility-Based Replication Strategies in Data Grids

Providing fast, reliable, and transparent data access to all users within a community is one of the most crucial functions of data management in a grid environment. Replication is regarded as one of the major optimization techniques for reducing access latency, improving data locality, and increasing the robustness, scalability, and performance of data grids. This paper proposes utility-based replication strategies, which make "buying" or replacement decisions based on a utility model. Compared with the economy-based strategies implemented in OptorSim, the utility-based strategies perform better in both job execution cost and storage consumption when the same jobs are completed in three different types of data grids. The simulation experiments also show that the larger the simulated data grid, the greater the savings in job execution cost, which gives our method an increasing advantage as data grids continue to grow.
