A More Realistic Way of Stressing the End-to-end I/O System

Synthetic I/O benchmarks and tests are often insufficient to stress a complex end-to-end I/O path realistically. Evaluations built solely around these benchmarks can help establish a high-level understanding of the system and save resources and time. However, they fail to identify subtle bugs and error conditions that occur only when running at large scale. The Oak Ridge Leadership Computing Facility (OLCF) recently started an effort to assess the I/O path more realistically and to improve the evaluation methodology used for major and minor file system software upgrades. To this end, an I/O test harness was built using a combination of real-world scientific applications and synthetic benchmarks. The experience with the harness and the testing methodology introduced are presented in this paper. The more systematic testing performed with the harness resulted in successful upgrades of Lustre on OLCF systems and a more stable computational and analysis environment.

Keywords-Parallel I/O; Lustre; testing; supercomputers