Job scheduling on the Earth Simulator

The Earth Simulator is a distributed-memory parallel system with a peak performance of 40 TFLOPS. It consists of 640 nodes connected by a fast 640 x 640 single-stage crossbar network. Because the system is so large, efficient job control was treated as one of the most important issues in its development. Many scheduling strategies are used on large-scale parallel computers to achieve efficient job control; the job scheduler for the Earth Simulator adopts an elapsed-time scheduling policy. This paper discusses that job scheduler and the Software Job Simulator, which was developed to evaluate job scheduling performance. Simulation results from the Software Job Simulator indicated that the efficiency of node usage on the Earth Simulator was about 62.5%.
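
To make the node-usage figure concrete, the following is a minimal sketch of one common way such an efficiency can be computed: occupied node-hours divided by available node-hours over a time window. This definition is an assumption, since the abstract does not state the exact formula used by the Software Job Simulator. The `Job` type, the `node_usage_efficiency` function, and the example workload are hypothetical and are not taken from the paper; only the node count of 640 comes from the text above. The workload is invented so that the arithmetic lands on the reported figure, which corresponds to an average of 400 of the 640 nodes being busy.

```python
from dataclasses import dataclass

TOTAL_NODES = 640  # node count stated in the abstract


@dataclass(frozen=True)
class Job:
    """Hypothetical job record used only for this illustration."""
    nodes: int    # number of nodes the job occupies
    start: float  # start time in hours
    end: float    # end time in hours


def node_usage_efficiency(jobs, window_start, window_end, total_nodes=TOTAL_NODES):
    """Return occupied node-hours divided by available node-hours in the window.

    Assumed definition; the paper's own metric may differ.
    """
    if window_end <= window_start:
        raise ValueError("window must have positive length")
    busy = 0.0
    for job in jobs:
        # Count only the part of the job that overlaps the window.
        overlap = min(job.end, window_end) - max(job.start, window_start)
        if overlap > 0:
            busy += job.nodes * overlap
    return busy / (total_nodes * (window_end - window_start))


# Invented workload, not data from the paper.
example_jobs = [
    Job(nodes=320, start=0.0, end=10.0),  # 3200 node-hours
    Job(nodes=160, start=0.0, end=5.0),   # 800 node-hours
]
efficiency = node_usage_efficiency(example_jobs, 0.0, 10.0)
print(f"{efficiency:.1%}")  # 4000 / 6400 node-hours
```

Under this assumed definition, idle nodes left waiting for a large job to start count against the efficiency in the same way as nodes with no queued work.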