Wormhole: Wisely Predicting Multidimensional Branches

Improving branch prediction accuracy is essential for enabling high-performance processors to find more concurrency and to improve energy efficiency by reducing wrong-path instruction execution, a paramount concern in today's power-constrained computing landscape. Branch prediction traditionally treats past branch outcomes as a linear, continuous bit stream in which it searches for patterns and correlations. The state-of-the-art TAGE predictor and its variants follow this approach while varying the length of the global history fragments they consider. This work identifies a construct, inherent to several applications, that challenges existing linear-history-based branch prediction strategies: some applications have branches that exhibit multidimensional correlations. These branches have two attributes: 1) they are enclosed within nested loops, and 2) they exhibit correlation across iterations of the outer loops. Folding the branch history and interpreting it as a multidimensional piece of information exposes these cross-iteration correlations, allowing predictors to search for more complex correlations in the history space at lower cost. We present Wormhole, a new side-predictor that exploits these multidimensional histories. Wormhole is integrated alongside ISL-TAGE and leverages information from its existing side-predictors. Experiments show that Wormhole improves accuracy more than existing side-predictors, some of which are commercially available, at a similar hardware cost. Across 40 diverse application traces, Wormhole reduces mispredictions per kilo-instruction (MPKI) by an average of 2.53% and 3.15% on top of 4KB and 32KB ISL-TAGE predictors, respectively. Considering only the top four workloads that exhibit multidimensional history correlations, Wormhole achieves average MPKI reductions of 22% and 20% over 4KB and 32KB ISL-TAGE, respectively.
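To make the idea of folding concrete, the following Python sketch illustrates it in software. It is only an illustration under stated assumptions, not the Wormhole hardware design: the function names `fold_history` and `predict_from_previous_row` are hypothetical, the inner-loop trip count is assumed to be known and constant, and the example branch (taken when the inner index `j` equals `i % 4`) is invented to produce a diagonal pattern across outer-loop iterations.

```python
def fold_history(history, trip_count):
    """Fold a linear outcome stream (oldest first) into rows.

    Each row holds the outcomes of one complete outer-loop iteration,
    assuming a constant inner-loop trip count (an assumption of this sketch).
    A trailing, incomplete iteration is left out of the folded view.
    """
    if trip_count <= 0:
        raise ValueError("trip_count must be positive")
    full = len(history) - len(history) % trip_count
    return [history[r:r + trip_count] for r in range(0, full, trip_count)]


def predict_from_previous_row(history, trip_count, column_offset=0):
    """Guess the next outcome from the previous outer-loop iteration.

    column_offset=0 reads the same inner index one row up; -1 and +1 read
    its left and right neighbours. Returns None if that entry is unavailable.
    """
    idx = len(history) - trip_count + column_offset
    if idx < 0 or idx >= len(history):
        return None
    return history[idx]


# Hypothetical branch: inside a 4-iteration inner loop, taken when j == i % 4.
TRIP = 4
outcomes = [int(j == i % TRIP) for i in range(3) for j in range(TRIP)]

# Two full outer iterations plus two outcomes of the third have been observed.
history = outcomes[:10]
rows = fold_history(history, TRIP)

same_column_guess = predict_from_previous_row(history, TRIP, column_offset=0)
diagonal_guess = predict_from_previous_row(history, TRIP, column_offset=-1)
actual = outcomes[10]
```

In this invented example the entry directly above the current position in the folded view is not taken, while the upper-left neighbour is taken and matches the actual outcome. In the linear stream that neighbour sits `trip_count + 1` outcomes back, so the distance grows with the inner-loop trip count, whereas in the folded view it is always one row up and one column over.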
