Paper Information - Mixing HTC and HPC Workloads with HTCondor and Slurm

Traditionally, the RHIC/ATLAS Computing Facility (RACF) at Brookhaven National Laboratory (BNL) has maintained only High Throughput Computing (HTC) resources for our high-energy and nuclear physics (HEP/NP) user community. We have used HTCondor as our batch system for many years, as this software is particularly well suited to managing HTC processor farm resources. Recently, the RACF has also begun to design and administer some High Performance Computing (HPC) systems for a multidisciplinary user community at BNL. In this paper, we discuss our experiences using HTCondor and Slurm in an HPC context, and our facility's attempts to allow our HTC and HPC processing farms and clusters to make opportunistic use of each other's computing resources.
