Dynamic replication algorithms for the multi-tier Data Grid

Data replication is a common method for improving data access performance in distributed systems. This paper proposes two dynamic replication algorithms for the multi-tier Data Grid: Simple Bottom-Up (SBU) and Aggregate Bottom-Up (ABU). A multi-tier Data Grid simulator, DRepSim, is developed to study the performance of the dynamic replication algorithms. The simulation results show that both algorithms greatly reduce the average response time of data access compared to the static replication method. ABU achieves large performance improvements for all access patterns, even when the available storage on the replication server is very small. Compared with the Fast Spread dynamic replication strategy, ABU proves superior. SBU is a different case: Fast Spread achieves a better average response time than SBU in most cases, but its replication frequency is too high to be applicable in the real world.
