A batch system for HEP applications on a distributed IaaS cloud

The emergence of academic and commercial Infrastructure-as-a-Service (IaaS) clouds is opening access to new resources for the HEP community. In this paper we will describe a system we have developed for creating a single dynamic batch environment spanning multiple IaaS clouds of different types (e.g. Nimbus, OpenNebula, Amazon EC2). A HEP user interacting with the system submits a job description file with a pointer to their VM image. VM images can either be created by users directly or provided to the users. We have created a new software component called Cloud Scheduler that detects waiting jobs and boots the user VM required on any one of the available cloud resources. As the user VMs appear, they are attached to the job queues of a central Condor job scheduler, the job scheduler then submits the jobs to the VMs. The number of VMs available to the user is expanded and contracted dynamically depending on the number of user jobs. We present the motivation and design of the system with particular emphasis on Cloud Scheduler. We show that the system provides the ability to exploit academic and commercial cloud sites in a transparent fashion.