Performance Evaluation of One and Two-Level Dynamic Branch Prediction Schemes over Comparable Hardware Costs

Branch prediction has become an area of interest because of its effect on the performance of pipelined and superscalar processors. Various methods have been proposed to speculate on the path of an instruction stream after a branch. In this paper, prediction schemes are evaluated not only by their prediction accuracy but also by the amount of hardware each technique requires to reach that accuracy. We model the configurations proposed by the authors of these schemes by allocating an equal number of bytes of memory to each scheme and then mapping, where possible, the tables the scheme needs onto that memory. The memory budget per scheme was varied from 1 byte to 128 kilobytes across runs. The inputs to the schemes were traces obtained by running the SPEC-92 benchmarks. We also compare the finite state machines proposed in [1] for updating the history bits in the 2-bit schemes and study their relative performance.
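To make the equal-memory comparison concrete, the following is a minimal sketch of how a byte budget might be mapped onto a table of 2-bit counters and scored against a trace. It assumes a saturating up/down counter as the update rule, which is one common 2-bit finite state machine and not necessarily any of those compared in [1]. The names counters_for_budget, TwoBitPredictor, and accuracy, the modulo indexing by branch address, and the (address, taken) trace format are all illustrative assumptions rather than details taken from the schemes evaluated here.

```python
def counters_for_budget(budget_bytes, bits_per_counter=2):
    """Number of counters that fit in the given memory budget."""
    return (budget_bytes * 8) // bits_per_counter


class TwoBitPredictor:
    """One-level predictor: a table of 2-bit saturating counters.

    Counter values 0 and 1 predict not taken; 2 and 3 predict taken.
    (Assumed update rule; other 2-bit state machines are possible.)
    """

    def __init__(self, budget_bytes):
        self.size = counters_for_budget(budget_bytes)
        # Start every counter at 1 (weakly not taken); an arbitrary choice.
        self.table = [1] * self.size

    def _index(self, pc):
        # Hypothetical mapping: branch address modulo table size.
        return pc % self.size

    def predict(self, pc):
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def accuracy(predictor, trace):
    """Fraction of branches in a non-empty trace predicted correctly.

    trace is a list of (branch_address, taken) pairs.
    """
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)
```

Under these assumptions, a 1-byte budget holds four counters and a 128-kilobyte budget holds 524,288, so sweeping the budget changes only how many branches share a counter. A two-level scheme would split the same budget between history registers and pattern tables, which this sketch does not attempt.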