Theoretical modeling of superscalar processor performance

Trace-driven simulation, the current approach to determining superscalar processor performance, is widely used but has shortcomings. Modern benchmarks generate extremely long traces, which cause data storage problems and very long simulation run times. More fundamentally, simulation generally provides little insight into the factors that determine performance and does not characterize their interactions. This paper proposes a theoretical model of superscalar processor performance that addresses these shortcomings. Performance is viewed as an interaction of program parallelism and machine parallelism. Both program parallelism and machine parallelism are decomposed into multiple component functions. Methods for measuring or computing these functions are described. The functions are combined to provide a model of the interaction between program and machine parallelism and an accurate estimate of performance. The performance computed from this model is compared to simulated performance for six benchmarks from the SPEC 92 suite on several configurations of the IBM RS/6000 instruction set architecture.