A List-Based Strategy for Optimal Replica Placement in Data Grid Systems

Data replication is a common strategy for improving access performance and data availability in data grid systems. Existing work on data replication in grid systems focuses on the replication infrastructure and on the mechanisms for creating and deleting replicas. The important problem of choosing suitable locations for replicas in data grids has not been fully studied. This paper addresses the replica placement problem in data grids when a sequence of priority lists specifying the forwarding policies for data requests is given. We propose the concept of a priority list to address two issues. First, a user may have limited authority to access resources, so his or her data requests must be kept from reaching some of the sites. Second, a static policy may not satisfy a data request with special requirements (e.g., a quality-of-service requirement). In this priority-list-based model, we propose a placement algorithm that finds optimal locations for replicas so that the workload among the replicas is balanced. We also propose an algorithm that determines the minimum number of replicas needed when the maximum workload capacity of each replica is given.
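To make the priority-list model concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes that each request carries an ordered list of sites it may use, that a request is served by the first site on its list that holds a replica, and that every request contributes one unit of workload. An exhaustive search stands in for the placement algorithm, so it is practical only for very small instances. All function and variable names are hypothetical.

```python
from itertools import combinations


def workloads(priority_lists, replicas):
    """Per-replica load, or None if some request cannot reach any replica.

    Assumption: a request goes to the first site on its priority list
    that holds a replica, and adds one unit of workload there.
    """
    load = {site: 0 for site in replicas}
    for plist in priority_lists:
        target = next((s for s in plist if s in replicas), None)
        if target is None:
            return None  # this request is not allowed on any replica site
        load[target] += 1
    return load


def best_placement(priority_lists, sites, k):
    """Brute force: the k-site placement with the smallest peak load."""
    best = None
    for combo in combinations(sites, k):
        load = workloads(priority_lists, set(combo))
        if load is None:
            continue
        peak = max(load.values())
        if best is None or peak < best[0]:
            best = (peak, combo)
    return best  # (peak load, sites) or None if no feasible placement


def min_replicas(priority_lists, sites, capacity):
    """Smallest k for which some placement keeps every load <= capacity."""
    for k in range(1, len(sites) + 1):
        result = best_placement(priority_lists, sites, k)
        if result is not None and result[0] <= capacity:
            return k, result[1]
    return None


# Hypothetical example: five requests over three sites.
requests = [["A", "B"], ["A", "C"], ["B", "A"], ["C", "B"], ["B"]]
sites = ["A", "B", "C"]
print(best_placement(requests, sites, 2))
print(min_replicas(requests, sites, capacity=2))
```

In this toy instance the last request may only use site B, so any placement that omits B is infeasible. This mirrors the access-restriction motivation for priority lists.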
