Dynamic Replica Selection Using Improved Kernel Density Estimation

Replication services in distributed systems can reduce access latency and bandwidth consumption. When several nodes hold replicas of the requested data, selecting the best replica yields a significant benefit. Most existing replica selection strategies focus on predicting response time. However, these strategies do not fully account for network dynamics and access locality. To solve this problem, a dynamic replica selection strategy using improved Kernel Density Estimation (KDE) is presented. First, it distinguishes old replicas from new ones. Then, it predicts the network load and available bandwidth to choose the best replica. The improved KDE can accurately select the best replica using only a small amount of historical data, which is very useful in a dynamic network. Simulation results demonstrate the efficiency and effectiveness of the improved KDE in comparison with other approaches.
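
As a rough illustration of the general idea, the following sketch applies a plain Gaussian KDE to a short history of observed bandwidth samples per replica and picks the replica whose estimated density peaks at the highest bandwidth. It is not the improved KDE of this paper: the old/new replica distinction and the network load prediction are not modeled, and the function names, the rule-of-thumb kernel width, and the use of the density mode as the prediction are all assumptions made for illustration.

```python
import math


def rule_of_thumb_width(samples):
    """Kernel width from the normal-reference rule: 1.06 * std * n^(-1/5)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    # Guard against a zero width when all samples are identical.
    return max(1.06 * math.sqrt(var) * n ** (-0.2), 1e-6)


def gaussian_kde(samples, width):
    """Return a function x -> estimated probability density at x."""
    n = len(samples)
    norm = 1.0 / (n * width * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / width) ** 2) for s in samples
        )

    return density


def predict_bandwidth(samples, grid_points=200):
    """Predict bandwidth as the mode of the KDE (an illustrative choice)."""
    lo, hi = min(samples), max(samples)
    if lo == hi:
        # A single sample or a constant history: nothing to estimate.
        return lo
    density = gaussian_kde(samples, rule_of_thumb_width(samples))
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + i * step for i in range(grid_points)]
    return max(grid, key=density)


def select_replica(history):
    """Pick the replica whose predicted bandwidth is highest.

    `history` maps a hypothetical replica name to its recent bandwidth
    samples (any consistent unit, e.g. MB/s).
    """
    return max(history, key=lambda name: predict_bandwidth(history[name]))
```

Using the mode rather than the mean makes the prediction less sensitive to an isolated outlier in a short history, since the mean of a Gaussian KDE is simply the sample mean.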
