Performance and scalability of a replica location service

We describe the implementation and evaluate the performance of a replica location service that is part of the Globus Toolkit Version 3.0. A replica location service (RLS) provides a mechanism for registering the existence of replicas and discovering them. Features of our implementation include the use of soft state update protocols to populate a distributed index and optional Bloom filter compression to reduce the size of these updates. Our results demonstrate that RLS performance scales well for individual servers with millions of entries and up to 100 requesting threads. We also show that the distributed RLS index scales well when using Bloom filter compression for wide area updates.
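As an illustration of the two mechanisms named above, the following is a minimal sketch, not the Globus implementation. It assumes a local catalog summarizes its logical names in a Bloom filter and sends that filter to an index node, which discards any summary that is not refreshed within a timeout (soft state). The class names, the SHA-256 double-hashing scheme, the filter size, and the timeout are all hypothetical choices made for this example.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array summarizing a set of names (false positives possible)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, name):
        # Derive k bit positions from one digest via double hashing (an assumption
        # of this sketch, not a detail taken from the paper).
        digest = hashlib.sha256(name.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, name):
        for pos in self._positions(name):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, name):
        # True for every added name; may also be True for names never added.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(name))


class SoftStateIndex:
    """Hypothetical index node that keeps each catalog's summary until it times out."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.filters = {}  # catalog id -> (BloomFilter, time received)

    def receive_update(self, catalog_id, bloom, now):
        # A fresh update replaces the previous summary and resets its age.
        self.filters[catalog_id] = (bloom, now)

    def candidates(self, name, now):
        # Return catalogs whose unexpired summary may contain the name.
        live = []
        for catalog_id, (bloom, received) in sorted(self.filters.items()):
            if now - received <= self.timeout_seconds and bloom.might_contain(name):
                live.append(catalog_id)
        return live
```

In this sketch the update sent over the wide area is the fixed-size bit array rather than the full list of names, which is why the update size does not grow with the number of entries. The cost is that a lookup at the index can return a catalog that does not actually hold the name, so a client would still confirm the result with that catalog.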
