A measurement and simulation methodology for parallel computing performance studies

Disciplined application of system measurement and performance simulation is a powerful method for understanding the behavior of parallel programs and computers. The current state of the art in parallel program performance analysis focuses on interconnection network and processor performance. Operating system interference, although recognized as a source of performance degradation, has not been formally considered in the analysis process. Distributed memory parallel programs often rely on periods of local computation taking the same amount of time on every processor before synchronous communication is used to exchange data between processors. When the amount of time varies between processors, those that execute fastest are left idle while the others catch up. This is particularly damaging to performance when the operations require a global synchronization, where one slow processor can induce wasted idle time on all others. The hypothesis of this work is that a disciplined method of interference measurement, coupled with a simulation of its effect on parallel programs, can enable performance analysts to consider interference in their diagnosis and tuning process. This dissertation provides the following new results in this topic area that demonstrate the viability of this methodology: (1) The design, implementation, and application of the fixed time quantum microbenchmark for quantifying operating system perturbations on parallel computers is provided, representing the first detailed implementation and analysis scheme for such measurements. (2) A trace-driven simulation of distributed memory message passing programs is presented. This adds operating system perturbations as a parameter for performance simulation, a contribution to the field of performance analysis. (3) Performance sensitivity studies for several real-world parallel applications are given using the microbenchmarking and simulation tools.
This work shows that quantification of interference is possible through carefully constructed microbenchmarks, one of which has been demonstrated and is now in use by researchers in the field. The simulation tools developed to analyze the effect of this noise have extended the capability of existing analysis tools to integrate this new performance-affecting parameter. Finally, this work demonstrates that the measurement and simulation methods can be applied to real codes to reason about their performance sensitivity.
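
The two ideas summarized above, measuring interference with a fixed time quantum and simulating its effect on globally synchronized programs, can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names `ftq_counts` and `simulate_synchronized_steps`, the busy-loop used as the unit of work, and the per-step noise table are all assumptions made for illustration. The first function counts how much work completes in each fixed-length time quantum; the second adds per-processor noise to ideal compute times and charges every processor for the slowest one at each global synchronization.

```python
import time
from typing import Dict, List


def ftq_counts(n_quanta: int, quantum_s: float) -> List[int]:
    """Count loop iterations completed in each fixed-length time quantum.

    Hypothetical sketch: the "work" is an empty counting loop. Quantum
    boundaries are computed from a running deadline so that overshoot in
    one quantum does not shift the boundaries of later ones.
    """
    counts = []
    deadline = time.perf_counter()
    for _ in range(n_quanta):
        deadline += quantum_s
        count = 0
        while time.perf_counter() < deadline:
            count += 1  # one unit of work
        counts.append(count)
    return counts


def simulate_synchronized_steps(
    compute: List[List[float]], noise: List[List[float]]
) -> Dict[str, object]:
    """Simulate steps of local computation, each ended by a global barrier.

    compute[p][s] is the ideal compute time of processor p in step s, and
    noise[p][s] is the extra time lost to interference (both in seconds,
    both assumed inputs). A step ends only when the slowest processor
    arrives, so every other processor accumulates idle time.
    """
    n_procs = len(compute)
    n_steps = len(compute[0])
    total_time = 0.0
    idle = [0.0] * n_procs
    for s in range(n_steps):
        durations = [compute[p][s] + noise[p][s] for p in range(n_procs)]
        step_time = max(durations)  # barrier waits for the slowest processor
        total_time += step_time
        for p in range(n_procs):
            idle[p] += step_time - durations[p]
    return {"total_time": total_time, "idle_per_proc": idle}


if __name__ == "__main__":
    # Measurement: quanta with noticeably lower counts than the rest
    # suggest that something else used part of that quantum.
    print(ftq_counts(n_quanta=5, quantum_s=0.001))

    # Simulation: 4 processors, 3 steps of 1.0 s each; processor 2 loses
    # 0.5 s to interference in the second step.
    compute = [[1.0] * 3 for _ in range(4)]
    noise = [[0.0] * 3 for _ in range(4)]
    noise[2][1] = 0.5
    print(simulate_synchronized_steps(compute, noise))
```

In the simulated example, a single 0.5 s perturbation on one processor lengthens the whole run from 3.0 s to 3.5 s and leaves each of the other three processors idle for 0.5 s, which is the amplification effect the abstract describes for global synchronization. A trace-driven simulator would replace the hand-written `compute` and `noise` tables with recorded program events and measured interference.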