Evolution of the pilot infrastructure of CMS: towards a single glideinWMS pool

CMS production and analysis job submission is based largely on glideinWMS and pilot submissions. The transition from multiple different submission solutions, such as gLite WMS and HTCondor-based implementations, was carried out over several years and is now coming to a conclusion. The separate glideinWMS pools for different types of production jobs and for analysis jobs, which exist for historical reasons, are being unified into a single global pool. This enables CMS to benefit from global prioritization and scheduling. It also presents the sites with only one kind of pilot and eliminates the need to make scheduling decisions at the CE level. This paper provides an analysis of the benefits of a unified resource pool, as well as a description of the resulting global policy. It explains the technical challenges that remain and presents solutions to some of them.
