Improving branch prediction by dynamic dataflow-based identification of correlated branches from a large global history

Deep pipelines and fast clock rates are necessitating the development of high accuracy, multi-stage branch predictors for future processors. Such a predictor uses a collection of predictors, each of which provides its predictions at a different stage of the pipeline front-end. A simple 1-cycle latency line predictor provides predictions in the first stage, followed in a couple of stages later by predictions from a more accurate global predictor. Finally, one or two stages later, a highly accurate corrector predictor selectively corrects the global predictor's prediction. As the corrector predictor has the final say, its accuracy must be very high. The focus of this paper is to propose and evaluate techniques to build high-accuracy corrector predictors.Our techniques rely on using a long global history, and identifying correlated branches in this history by using runtime dataflow information. In particular, we identify for each dynamic branch a set of branches called "affectors", which control the computation that affect that branch's outcome. We propose efficient hardware structures to track dataflow and to identify the affector branches for each dynamic branch; the hardware overhead for identifying affectors for all dynamic branches from a 64 branch global history is only 312 bytes. We then propose two prediction schemes that put to use the affector branch information. Experimental studies show that adding an 8KB corrector predictor (that uses affector information) to a 16KB perceptron predictor (total size 24.2KB) reduces the average misprediction rate for 12 benchmarks from 6.3% to 5.7%, an improvement achieved only by a 64KB perceptron predictor.

[1]  Alvin M. Despain,et al.  The 16-fold way: a microparallel taxonomy , 1993, MICRO 1993.

[2]  T. Juan,et al.  Dynamic history-length fitting: a third level of adaptivity for branch prediction , 1998, Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235).

[3]  Karel Driesen,et al.  The cascaded predictor: economical and adaptive branch target prediction , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.

[4]  Brad Calder,et al.  Automated design of finite state machine predictors for customized processors , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.

[5]  Eric Sprangle,et al.  Increasing processor performance by implementing deeper pipelines , 2002, ISCA.

[6]  Pierre Michaud,et al.  Trading Conflict And Capacity Aliasing In Conditional Branch Predictors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[7]  Yale N. Patt,et al.  The agree predictor: a mechanism for reducing negative branch history interference , 1997, ISCA '97.

[8]  Calvin Lin,et al.  Delay-sensitive branch predictors for future technologies , 2002 .

[9]  Yale N. Patt,et al.  An analysis of correlation and predictability: what makes two-level branch predictors work , 1998, ISCA.

[10]  Stamatis Vassiliadis,et al.  Register renaming and dynamic speculation: an alternative approach , 1993, MICRO.

[11]  Trevor N. Mudge,et al.  The bi-mode branch predictor , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[12]  Andreas Moshovos,et al.  Streamlining inter-operation memory communication via data dependence prediction , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[13]  Daniel A. Jiménez,et al.  Dynamic branch prediction with perceptrons , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.

[14]  Lei Chen,et al.  Dynamic data dependence tracking and its application to branch prediction , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..

[15]  Yale N. Patt,et al.  Variable length path branch prediction , 1998, ASPLOS VIII.

[16]  Trevor N. Mudge,et al.  The YAGS branch prediction scheme , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.

[17]  James E. Smith,et al.  Improving branch predictors by correlating on data values , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.

[18]  Guang R. Gao,et al.  Elastic history buffer: a low-cost method to improve branch prediction accuracy , 1997, Proceedings International Conference on Computer Design VLSI in Computers and Processors.

[19]  Daniel A. Jiménez,et al.  The impact of delay on the design of branch predictors , 2000, Proceedings 33rd Annual IEEE/ACM International Symposium on Microarchitecture. MICRO-33 2000.