The main goal of a Workload Management System (WMS) is to find and allocate resources for the given tasks. The more and better job information the WMS receives, the easier it is to accomplish this task, which directly translates into higher utilization of resources. Traditionally, the information associated with each job, such as its expected runtime, is defined beforehand by the Production Manager in the best case and set to fixed, arbitrary values by default. LHCb's Workload Management System provides no mechanism to automate the estimation of job requirements. As a result, much more CPU time is normally requested than is actually needed. This is a major problem in the context of multicore jobs in particular, since single-core and multicore jobs must share the same resources. Consequently, grid sites need to rely on estimates given by the virtual organisations (VOs) in order not to decrease the utilization of their worker nodes when making multicore job slots available. The main reason for moving to multicore jobs is the reduction of the overall memory footprint. Therefore, it also needs to be studied how the memory consumption of jobs can be estimated. A detailed workload analysis of past LHCb jobs is presented. It includes a study of job features and their correlation with runtime and memory consumption. Based on these features, a supervised learning algorithm is developed that uses history-based prediction. The aim is to learn over time how the runtime and memory consumption of jobs evolve under changes in experiment conditions and software versions. It will be shown that the estimates can be notably improved if experiment conditions are taken into account.
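To make the idea of history-based prediction concrete, the following is a minimal sketch in Python, not the algorithm developed in this work. It assumes each job is described by an ordered tuple of features (here, hypothetically, application, software version and experiment condition) and predicts a requirement as the mean of recent observations for the most specific feature prefix that has any history. The class name `HistoryPredictor`, its methods and the feature names are illustrative assumptions.

```python
from collections import defaultdict, deque
from statistics import mean


class HistoryPredictor:
    """Predict a job requirement (e.g. runtime) from similar past jobs.

    Hypothetical sketch: features are an ordered tuple, most general first,
    for example (application, software_version, experiment_condition).
    """

    def __init__(self, window=50):
        # Keep only the most recent `window` observations per feature prefix,
        # so that predictions follow changes over time.
        self.history = defaultdict(lambda: deque(maxlen=window))

    @staticmethod
    def _prefixes(features):
        # Feature prefixes from most specific to least specific;
        # the empty tuple acts as a global fallback.
        items = tuple(features)
        return [items[:n] for n in range(len(items), -1, -1)]

    def observe(self, features, value):
        # Record a finished job under every prefix of its feature tuple.
        for key in self._prefixes(features):
            self.history[key].append(value)

    def predict(self, features, default=None):
        # Use the most specific prefix for which history exists.
        for key in self._prefixes(features):
            past = self.history.get(key)
            if past:
                return mean(past)
        return default


predictor = HistoryPredictor()
predictor.observe(("sim", "v1", "cond-A"), 100.0)
predictor.observe(("sim", "v1", "cond-A"), 200.0)
predictor.observe(("sim", "v1", "cond-B"), 600.0)

# Exact match on all features, including the experiment condition.
exact = predictor.predict(("sim", "v1", "cond-A"))
# Unseen condition: falls back to the (application, version) history.
fallback = predictor.predict(("sim", "v1", "cond-C"))
```

In this sketch, including the experiment condition in the feature tuple lets the estimate for a known condition ignore jobs run under other conditions, while unseen conditions fall back to coarser history. The same structure could be fed with peak memory instead of runtime.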