Response Time Optimization for Replica Selection Service in Data Grids

Problem Statement: Data grid architecture provides a scalable infrastructure for grid services that manage data files and their corresponding replicas, which are distributed across the globe. These grid services are designed to support a variety of data grid applications (jobs) and projects. Replica selection is a high-level service that chooses, from among many distributed replicas, the replica location with the minimum response time for the users' jobs. Estimating the response time accurately in a grid environment is not an easy task. Current systems exhibit high response times when selecting the required replicas because they estimate the response time from the data transfer time only.

Approach: We propose a replica selection system that selects the best replica location for the users' running jobs, that is, the location with the minimum response time. The response time is estimated by considering two factors in addition to the data transfer time: the storage access latency and the replica requests waiting in the storage queue.

Results: The performance of the proposed system was compared with that of a similar system from the literature, SimpleOptimiser. The simulation results demonstrated that our system performed better than SimpleOptimiser by 6% on average.

Conclusions: The proposed system can select the best replica location with a lower response time than SimpleOptimiser, and its efficiency is 6% higher. This efficiency gain has a high impact on the quality of service perceived by grid users in a data grid environment where the data files are relatively large. For example, the data files produced by scientific applications can be hundreds of terabytes in size.
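
The following minimal sketch illustrates the idea behind the approach: a replica is ranked by an estimated response time that includes storage access latency and queue waiting time as well as data transfer time. The abstract does not give the exact formula, so the additive model used here is an assumption, as are all names (`Replica`, `estimate_response_time`, `select_replica`, `select_by_transfer_only`), the fields, and the numbers.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical description of one replica location (all fields assumed)."""
    site: str
    bandwidth_mb_per_s: float   # available network bandwidth to the requester
    storage_latency_s: float    # storage access latency for one request
    queued_requests: int        # replica requests already waiting in the storage queue
    mean_service_time_s: float  # average time to serve one queued request


def estimate_response_time(replica: Replica, file_size_mb: float) -> float:
    """Assumed additive model: transfer time + storage latency + queue wait."""
    transfer = file_size_mb / replica.bandwidth_mb_per_s
    queue_wait = replica.queued_requests * replica.mean_service_time_s
    return transfer + replica.storage_latency_s + queue_wait


def select_replica(replicas, file_size_mb):
    """Pick the replica with the smallest estimated response time."""
    return min(replicas, key=lambda r: estimate_response_time(r, file_size_mb))


def select_by_transfer_only(replicas, file_size_mb):
    """Baseline for comparison: rank replicas by data transfer time alone."""
    return min(replicas, key=lambda r: file_size_mb / r.bandwidth_mb_per_s)


if __name__ == "__main__":
    candidates = [
        Replica("A", bandwidth_mb_per_s=100.0, storage_latency_s=0.5,
                queued_requests=10, mean_service_time_s=30.0),
        Replica("B", bandwidth_mb_per_s=50.0, storage_latency_s=1.0,
                queued_requests=0, mean_service_time_s=30.0),
    ]
    print(select_by_transfer_only(candidates, 1000.0).site)
    print(select_replica(candidates, 1000.0).site)
```

With these illustrative values, site A has the faster link but a long storage queue, so ranking by transfer time alone favours A, while the fuller estimate favours B. This is the kind of case in which considering only the data transfer time leads to a poor choice.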
