Data-oriented scheduling for PROOF
The Parallel ROOT Facility (PROOF) is a distributed analysis system optimized for I/O-intensive analysis of HEP data. With the LHC entering the analysis phase, PROOF has become a natural ingredient of computing farms at the Tier3 level. These analysis facilities will typically be used by a few tens of users, and can also be federated into a kind of analysis cloud corresponding to the Virtual Organization of the experiment. Proper scheduling is required to guarantee fair resource usage, to enforce priority policies and to optimize throughput. In this paper we discuss an advanced priority system that we are developing for PROOF. The system has been designed to adapt automatically to tasks of unknown length, and to take into account data location and availability (including distribution across geographically separated sites) as well as the {group, user} default priorities. In this system, every element (user, group, dataset, job slot and storage) gets its own priority, and those priorities are dynamically linked with each other. In order to tune the interplay between the various components, we have designed and started implementing a simulation application that can model PROOF clusters of various types and sizes. In this application a monitoring package records all changes to these components so that we can easily understand and tune the performance. We will discuss the status of our simulation and show examples of the results we expect from it.
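To make the idea of linked priorities concrete, the following is a minimal sketch and not the scheduler described in this paper. It assumes a multiplicative combination of group and user priorities, a fixed bonus when the free job slot sits at a site holding the requested dataset, and a simple usage penalty for fairness. The names `Job`, `ToyScheduler`, `locality_bonus` and the formula itself are hypothetical illustrations; none of them comes from PROOF.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    user: str
    group: str
    dataset: str


@dataclass
class ToyScheduler:
    """Hypothetical toy model; the weighting formula is an assumption."""
    group_priority: dict          # group name -> default priority
    user_priority: dict           # user name -> default priority
    dataset_sites: dict           # dataset name -> set of sites holding a copy
    locality_bonus: float = 3.0   # assumed weight for data-local slots
    usage: dict = field(default_factory=dict)  # user -> slots granted so far

    def effective_priority(self, job: Job, site: str) -> float:
        # Default {group, user} priorities; unknown names fall back to 1.0.
        base = (self.group_priority.get(job.group, 1.0)
                * self.user_priority.get(job.user, 1.0))
        # Prefer slots at a site that already holds the dataset.
        is_local = site in self.dataset_sites.get(job.dataset, set())
        bonus = self.locality_bonus if is_local else 1.0
        # Users who have already received slots are scaled down.
        return base * bonus / (1.0 + self.usage.get(job.user, 0))

    def assign(self, queue: list, site: str):
        """Remove and return the best job for a free slot at `site`."""
        if not queue:
            return None
        best = max(queue, key=lambda j: self.effective_priority(j, site))
        queue.remove(best)
        self.usage[best.user] = self.usage.get(best.user, 0) + 1
        return best
```

In this sketch, a job from a lower-priority group can still win a slot when its dataset is local to that slot, which is one simple way the data location and the {group, user} priorities could interact. A real system would also need to handle dataset availability, storage priorities and tasks of unknown length, which this toy model omits.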