Optimal placement of replicas in data grid environments with locality assurance

Data replication is a common strategy for improving access performance and data availability in data grid systems. Existing work on data replication in grid systems has focused on the replication infrastructure and on mechanisms for creating and deleting replicas. The important problem of choosing suitable locations for replicas in data grids has not been well studied. In this paper, we address the problem of data replica placement in data grids, given the traffic pattern and the locality requirements. We propose a new placement algorithm that finds the optimal locations for the replicas so that the workload among these replicas is balanced. We also propose a new algorithm that determines the minimum number of replicas required when the maximum workload capacity of each replica server is known. All of these algorithms ensure that the locality requirements of the users are satisfied.
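To make the problem statement concrete, the following is a minimal sketch of how a candidate placement might be checked against the two constraints mentioned above. It is not the placement algorithm proposed in the paper. It assumes, purely for illustration, that the grid is a tree, that each request travels toward the root until it reaches a replica, that the locality requirement is a bound on the number of hops, and that every replica server has the same capacity. The names `evaluate_placement`, `parent`, `workload`, `max_hops`, and `capacity` are hypothetical.

```python
def evaluate_placement(parent, workload, replicas, max_hops, capacity):
    """Return the load on each replica, or None if the placement is infeasible.

    Assumptions (illustrative, not taken from the paper):
      - parent maps each node to its parent in a tree (the root maps to None)
      - a request from a node is served by the nearest replica on the path
        from that node to the root (the node itself counts, at 0 hops)
      - locality is satisfied when that replica is at most max_hops away
      - every replica server can handle at most `capacity` requests
    """
    load = {r: 0 for r in replicas}
    for node, requests in workload.items():
        if requests == 0:
            continue  # nodes that issue no requests impose no constraint
        current, hops = node, 0
        # Walk toward the root until a replica is found.
        while current is not None and current not in replicas:
            current = parent[current]
            hops += 1
        if current is None or hops > max_hops:
            return None  # locality requirement violated
        load[current] += requests
    if any(total > capacity for total in load.values()):
        return None  # some replica server is overloaded
    return load


# Hypothetical six-node tree: r is the root, a and b are its children.
parent = {"r": None, "a": "r", "b": "r", "a1": "a", "a2": "a", "b1": "b"}
workload = {"r": 0, "a": 2, "b": 1, "a1": 5, "a2": 3, "b1": 4}

feasible = evaluate_placement(parent, workload, {"r", "a"}, max_hops=2, capacity=10)
too_far = evaluate_placement(parent, workload, {"r", "a"}, max_hops=1, capacity=10)
```

In this example, `feasible` holds the per-replica loads, while `too_far` is `None` because requests from `b1` would need two hops to reach the replica at `r`. A placement algorithm would search over the possible replica sets; a checker of this kind only validates one candidate.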
