A performance study of monitoring and information services for distributed systems

Monitoring and information services form a key component of a distributed system, or Grid. A quantitative study of such services can aid in understanding their performance limitations, guide the deployment of the monitoring system, and help evaluate future development work. To this end, we study the performance of three monitoring and information services for distributed systems: the Globus Toolkit® Monitoring and Discovery Service (MDS2), the European Data Grid Relational Grid Monitoring Architecture (R-GMA), and Hawkeye, part of the Condor project. We perform experiments to test their scalability with respect to the number of users, the number of resources, and the amount of data collected. Our study shows that each approach behaves differently, often because of its different design goals. In the four sets of experiments we conducted to evaluate the performance of the service components under different circumstances, we found a strong advantage to caching or pre-fetching the data, as well as a need to place primary components at well-connected sites because of the high load seen by all systems.
