Reducing misprediction penalty in the Branch Target Buffer

Ideal speedup in pipelined processors is seldom achieved because of stalls and breaks in the execution stream. These interruptions are caused by data and control hazards; of the two, control hazards can be the more detrimental to pipeline performance. A Branch Target Buffer (BTB) can reduce the performance penalty of branches in pipelined processors by predicting the path of the branch and caching information used by the branch. No stalls are incurred if the branch entry is found in the BTB and the prediction is correct; otherwise, the penalty is at least two cycles. This paper proposes a novel algorithm, based on changing the BTB structure, to eliminate the branch misprediction penalty. It also highlights a problem in previous BTB algorithms, the nested-branches problem, and proposes a solution to it.
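
The baseline behavior described in the abstract (no stall on a BTB hit with a correct prediction, a penalty of at least two cycles otherwise) can be illustrated with a minimal sketch. This models a conventional BTB only, not the algorithm the paper proposes. The class and function names, the one-bit "last outcome" predictor, the unbounded table, and the fixed two-cycle penalty are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

# Assumption: charge exactly the lower bound ("at least two cycles") from the text.
MISS_PENALTY_CYCLES = 2


@dataclass
class BTBEntry:
    target: int          # cached branch target address
    predict_taken: bool  # one-bit prediction (assumed; the text names no scheme)


class BranchTargetBuffer:
    """Hypothetical BTB keyed by branch address, with no capacity limit."""

    def __init__(self) -> None:
        self.entries: Dict[int, BTBEntry] = {}

    def lookup(self, pc: int) -> Optional[BTBEntry]:
        return self.entries.get(pc)

    def resolve(self, pc: int, taken: bool, target: int) -> int:
        """Return the stall cycles charged for this branch, then update the entry."""
        entry = self.lookup(pc)
        hit_and_correct = (
            entry is not None
            and entry.predict_taken == taken
            and (not taken or entry.target == target)
        )
        # Remember the latest outcome so the next lookup predicts it.
        self.entries[pc] = BTBEntry(target=target, predict_taken=taken)
        return 0 if hit_and_correct else MISS_PENALTY_CYCLES


def total_stalls(trace: Iterable[Tuple[int, bool, int]]) -> int:
    """Sum stall cycles over a trace of (branch pc, taken, target) records."""
    btb = BranchTargetBuffer()
    return sum(btb.resolve(pc, taken, target) for pc, taken, target in trace)


# A loop branch at 0x40 that jumps back to 0x10 three times, then falls through.
loop_trace = [(0x40, True, 0x10)] * 3 + [(0x40, False, 0x10)]
stalls = total_stalls(loop_trace)
```

In this sketch the first encounter misses in the buffer and the final fall-through is mispredicted, so only those two branches are charged the penalty; the two intermediate iterations hit with a correct prediction and cost nothing.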
