Impact of Reservations on Production Job Scheduling

The TeraGrid is a closely linked community of diverse resources: computational, data, and experimental. Examples include the imminent very large computational system at the University of Texas, the extensive data facilities at SDSC, and the physics experiments at ORNL. As research efforts grow in scope, co-scheduling of multiple resources becomes essential to scientific progress. This can be at odds with the traditional management of computational systems, where utilization, queue wait times, and expansion factors are considered paramount and anything that might degrade them is viewed with suspicion. Such concerns can be addressed only by a thorough investigation of the likely effects of advance reservations on these performance metrics. To understand the impact, we developed a simulator that reads our actual production job logs and reservation request data to investigate different scheduling scenarios. We explored the effect of reservations and policies using job log data from two different months in consecutive years, and we present our initial results here. The simulations suggest that utilization, expansion factor, and queue wait time can indeed be negatively affected by reservations that are significant in number and size, but that this effect can be mitigated with appropriate policies.
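
For readers unfamiliar with the three metrics named above, the following is a minimal sketch of how they are commonly computed from a job log. It is not the simulator described here. The `Job` record, its field names, and the helper functions are hypothetical, and the expansion factor is assumed to follow the usual definition of (wait time + run time) / run time, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job-log record; times are in seconds, nodes is a count."""
    submit: float
    start: float
    end: float
    nodes: int


def queue_wait(job: Job) -> float:
    """Time the job spent waiting in the queue before it started."""
    return job.start - job.submit


def expansion_factor(job: Job) -> float:
    """Assumed definition: (wait time + run time) / run time."""
    run = job.end - job.start
    return (queue_wait(job) + run) / run


def utilization(jobs: list[Job], total_nodes: int, t0: float, t1: float) -> float:
    """Fraction of available node-seconds in the window [t0, t1] used by jobs."""
    busy = 0.0
    for j in jobs:
        # Count only the part of each job that overlaps the window.
        overlap = min(j.end, t1) - max(j.start, t0)
        if overlap > 0:
            busy += j.nodes * overlap
    return busy / (total_nodes * (t1 - t0))


# Two jobs on a hypothetical 8-node machine, both submitted at t = 0.
jobs = [Job(submit=0, start=0, end=100, nodes=4),
        Job(submit=0, start=100, end=200, nodes=8)]

waits = [queue_wait(j) for j in jobs]
factors = [expansion_factor(j) for j in jobs]
util = utilization(jobs, total_nodes=8, t0=0, t1=200)
```

In this toy log the second job waits as long as it runs, so its expansion factor is 2.0, while the machine as a whole is 75% utilized. An advance reservation that holds nodes idle ahead of its start time would tend to raise the first two quantities and lower the third, which is the effect the simulations are meant to quantify.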