Experiment flows and microbenchmarks for reverse engineering of branch predictor structures

Insights into branch predictor organization and operation can be used in architecture-aware compiler optimizations to improve program performance. Unfortunately, such details are rarely disclosed publicly. In this paper, we introduce a set of experiment flows and corresponding microbenchmarks for reverse engineering cache-like branch target and branch outcome predictor structures that are indexed by branch address or program path information. The experiment flows are demonstrated on the Intel Pentium M branch predictor. We determine the size, organization, internal operation, and interactions of the hardware structures used in the Pentium M branch predictor, namely the branch target buffer, indirect branch target buffer, loop branch predictor buffer, global predictor, and bimodal predictor. These findings have been validated using a functional Pin model.
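As a rough illustration of how a cache-like structure indexed by branch address can be probed, the sketch below runs a simplified experiment against a simulated buffer rather than real hardware. Everything in it is an assumption made for illustration: the `ToyBTB` class, its LRU replacement, its index function, and its parameters (128 sets, 4 ways, 16-byte granularity) are hypothetical and are not the Pentium M values or the paper's actual microbenchmarks. On real hardware, the hit/miss signal would instead come from misprediction counts measured with performance counters.

```python
from collections import OrderedDict


class ToyBTB:
    """Hypothetical set-associative, LRU-replaced branch target buffer."""

    def __init__(self, num_sets=128, ways=4, offset_bits=4):
        self.num_sets = num_sets
        self.ways = ways
        self.offset_bits = offset_bits
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, addr):
        """Return True on a hit; on a miss, install the branch (LRU eviction)."""
        block = addr >> self.offset_bits      # drop the low offset bits
        entries = self.sets[block % self.num_sets]
        if block in entries:
            entries.move_to_end(block)        # mark as most recently used
            return True
        if len(entries) >= self.ways:
            entries.popitem(last=False)       # evict the least recently used
        entries[block] = True
        return False


def miss_rate(btb_factory, num_branches, distance, iterations=4):
    """Loop over num_branches branches spaced `distance` bytes apart."""
    btb = btb_factory()
    addrs = [i * distance for i in range(num_branches)]
    for addr in addrs:                        # warm-up pass, not counted
        btb.access(addr)
    misses = 0
    for _ in range(iterations):
        for addr in addrs:
            if not btb.access(addr):
                misses += 1
    return misses / (iterations * num_branches)


def max_fitting(btb_factory, distance, limit):
    """Largest branch count (up to limit) that loops with zero misses."""
    for count in range(1, limit + 1):
        if miss_rate(btb_factory, count, distance) > 0:
            return count - 1
    return limit


if __name__ == "__main__":
    factory = lambda: ToyBTB(num_sets=128, ways=4, offset_bits=4)
    # Dense spacing spreads branches over all sets and exposes total capacity.
    capacity = max_fitting(factory, distance=16, limit=1024)
    # A very large power-of-two spacing maps every branch to one set,
    # so the largest fitting count equals the associativity.
    ways = max_fitting(factory, distance=1 << 20, limit=64)
    print(capacity, ways, capacity // ways)
```

With the toy parameters above, the two sweeps recover a capacity of 512 entries, 4 ways, and therefore 128 sets. The same idea, varying the number of branches and the distance between them and watching where misses begin, applies only under the stated assumption that the structure behaves like an LRU cache indexed by low address bits.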
