Exploiting Intra-function Correlation with the Global History Stack

The demand for more computation power in high-end embedded systems has put embedded processors on parallel evolution track as the RISC processors. Caches and deeper pipelines are standard features on recent embedded microprocessors. As a result of this, some of the performance penalties associated with branch instructions in RISC processors are becoming more prevalent in these processors. As is the case in RISC architectures, designers have turned to dynamic branch prediction to alleviate this problem. Global correlating branch predictors take advantage of the influence past branches have on future ones. The conditional branch outcomes are recorded in a global history register (GHR). Based on the hypothesis that most correlation is among intra-function branches, we provide a detailed analysis of the Global History Stack (GHS) in this paper. The GHS saves the global history in the return address stack when a call instruction is executed. Following the subsequent return, the history is restored from the stack. In addition, to preserve the correlation between the callee branches and the caller branches following the call instruction, we save a few of the history bits coming from the end of the callee's execution. We also investigate saving the GHR of a function in the Branch Target Buffer (BTB) when it returns so that it can be restored when that function is called again. Our results show that these techniques improve the accuracy of several global history based prediction schemes by 4% on average. Consequently, performance improvements as high as 13% are attained.

[1]  S. McFarling Combining Branch Predictors , 1993 .

[2]  Trevor N. Mudge,et al.  The YAGS branch prediction scheme , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.

[3]  James E. Smith,et al.  Path-based next trace prediction , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[4]  D. B. Davis,et al.  Intel Corp. , 1993 .

[5]  Todd M. Austin,et al.  The SimpleScalar tool set, version 2.0 , 1997, CARN.

[6]  Pierre Michaud,et al.  De-aliased Hybrid Branch Predictors , 1999 .

[7]  Yale N. Patt,et al.  Variable length path branch prediction , 1998, ASPLOS VIII.

[8]  Anne Rogers,et al.  The performance impact of incomplete bypassing in processor pipelines , 1995, MICRO 1995.

[9]  A. Seznec,et al.  Trading Conflict And Capacity Aliasing In Conditional Branch Predictors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[10]  Yale N. Patt,et al.  A two-level approach to making class predictions , 2003, 36th Annual Hawaii International Conference on System Sciences, 2003. Proceedings of the.

[11]  Ravi Nair,et al.  Dynamic path-based branch correlation , 1995, MICRO 28.

[12]  Norman P. Jouppi,et al.  Cacti 3. 0: an integrated cache timing, power, and area model , 2001 .

[13]  Yale N. Patt,et al.  The agree predictor: a mechanism for reducing negative branch history interference , 1997, ISCA '97.

[14]  Chris Wilkerson,et al.  Improving branch prediction by dynamic dataflow-based identification of correlated branches from a large global history , 2003, ISCA '03.

[15]  Dirk Grunwald,et al.  Quantifying Behavioral Differences Between C and C++ Programs , 1994 .

[16]  Yale N. Patt,et al.  An analysis of correlation and predictability: what makes two-level branch predictors work , 1998, ISCA.

[17]  Yiannakis Sazeides,et al.  Design tradeoffs for the Alpha EV8 conditional branch predictor , 2002, ISCA.

[18]  Trevor N. Mudge,et al.  Correlation and Aliasing in Dynamic Branch Predictors , 1996, ISCA.

[19]  Trevor N. Mudge,et al.  The bi-mode branch predictor , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.