The Pilot Way to Grid Resources Using glideinWMS

Grid computing has become very popular in large, geographically distributed scientific communities with high computing demands, such as high energy physics. Computing resources are distributed over many independent sites, with only a thin layer of Grid middleware shared between them. This deployment model has proven very convenient for computing resource providers, but it has introduced several problems for the users of the system. The three major ones are the complexity of job scheduling, the non-uniformity of compute resources, and the lack of good job monitoring. Pilot jobs address all of these problems by creating a virtual private computing pool on top of Grid resources. This paper presents both the general pilot concept and a concrete implementation, called glideinWMS, deployed in the Open Science Grid.
