PHFS: A dynamic replication method, to decrease access latency in the multi-tier data grid

Abstract: Data replication improves the performance of data access in distributed systems. Dynamic replication adapts the replica configuration as users' behavior changes over time, so that the benefits of replication are preserved. In this paper, we propose a new dynamic replication method for the multi-tier data grid called predictive hierarchical fast spread (PHFS), an extension of fast spread, an existing dynamic replication method for data grids. Exploiting spatial locality, PHFS tries to predict future needs and pre-replicates the predicted data in a hierarchical manner, which increases access locality and consequently improves performance. We compare PHFS with common fast spread (CFS) on an example, from the perspective of access latency. The results show that PHFS yields lower latency and better performance than CFS.
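The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the abstract gives no details, so everything below is an assumption made for illustration. The grid is modeled as a tree whose root holds every file, fast spread is taken to mean storing a replica at each node on the path from the root to the requesting client, and the prediction step is replaced by a simple successor counter. The names `Node`, `SuccessorPredictor`, `predictive_fast_spread`, and `access_hops`, the parameter `k`, and the rule that weaker predictions stop one tier higher are all hypothetical.

```python
from collections import defaultdict


class Node:
    """A node in a tree-shaped (multi-tier) data grid."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.files = set()


def path_from_root(node):
    """Return the nodes from the root down to `node`, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path[::-1]


def fast_spread(client, file_id):
    """Assumed fast spread: replicate at every node on the path to the client."""
    for node in path_from_root(client):
        node.files.add(file_id)


class SuccessorPredictor:
    """Hypothetical predictor: counts which file tends to follow which."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.last = None

    def record(self, file_id):
        if self.last is not None:
            self.counts[self.last][file_id] += 1
        self.last = file_id

    def predict(self, file_id, k):
        followers = self.counts[file_id]
        # Most frequent followers first; ties broken by file name.
        return sorted(followers, key=lambda f: (-followers[f], f))[:k]


def predictive_fast_spread(client, file_id, predictor, k=2):
    """Replicate the requested file, then pre-replicate predicted followers."""
    fast_spread(client, file_id)
    path = path_from_root(client)
    for rank, candidate in enumerate(predictor.predict(file_id, k)):
        # Assumption: the strongest prediction reaches the client itself,
        # and each weaker one stops one tier higher in the hierarchy.
        depth = max(1, len(path) - rank)
        for node in path[:depth]:
            node.files.add(candidate)
    predictor.record(file_id)


def access_hops(client, file_id):
    """Tiers climbed from the client to the nearest replica, or None."""
    hops, node = 0, client
    while node is not None:
        if file_id in node.files:
            return hops
        node, hops = node.parent, hops + 1
    return None
```

In this toy model, `access_hops` stands in for access latency: a file that was pre-replicated close to the client is found after fewer hops than one that must be fetched from the root, which is the kind of comparison between PHFS and CFS that the abstract describes.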
