Low-power branch target buffer for application-specific embedded processors

In this paper we present a methodology for a low-power branch identification mechanism that enables the design of extremely power-efficient branch predictors for embedded processors. The proposed technique uses application-specific information about the control-flow structure of the program's major loops. This information is used to completely eliminate the power-hungry branch target buffer (BTB) lookups that normally occur on every execution cycle. Exact application knowledge of the control-flow structure obviates the power-expensive BTB operations, thus enabling the use of contemporary branch predictors in high-end, yet power-sensitive, embedded processors. This knowledge results not only in the complete elimination of the power-hungry BTB structure but also in perfect identification of branches and their target addresses. A cost-efficient and programmable hardware architecture for capturing the control-flow structure of the program is then presented. The hardware complexity of the proposed architecture is carefully analyzed in terms of power, performance, and area overhead. The proposed technique delivers power reductions in excess of 90% for a set of embedded benchmarks.
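
To make the idea concrete, the following is a minimal sketch of how a small table describing a loop's control flow could identify branches without probing a BTB on every fetch. It is an illustration under stated assumptions, not the architecture proposed in the paper: the entry layout (`BranchEntry`), the loop addresses, the fixed 4-byte instruction size, and the branch outcomes are all hypothetical. The sketch only counts accesses; it does not model power.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchEntry:
    """Hypothetical table entry describing one branch of a known loop."""
    pc: int               # address of the branch instruction
    target: int           # address fetched next when the branch is taken
    next_taken: int       # index of the next branch reached if taken
    next_not_taken: int   # index of the next branch reached if not taken
    is_back_edge: bool    # True for the loop-closing branch


# Assumed loop: 8 instructions at 0x100..0x11C, 4 bytes each.
# 0x10C is a forward conditional branch that skips 0x110;
# 0x11C is the loop back-edge.
TABLE = [
    BranchEntry(pc=0x10C, target=0x114, next_taken=1, next_not_taken=1,
                is_back_edge=False),
    BranchEntry(pc=0x11C, target=0x100, next_taken=0, next_not_taken=0,
                is_back_edge=True),
]


def simulate(table, entry_pc, iterations):
    """Run the loop and return (fetches, table_reads).

    A conventional BTB would be probed once per fetch. Here the next
    expected branch is latched, so the table is read only when a branch
    executes and the following entry has to be loaded.
    """
    pc = entry_pc
    latched = table[0]    # entry held in a register and compared with the PC
    fetches = 0
    table_reads = 1       # initial load of the first entry
    iteration = 0
    while iteration < iterations:
        fetches += 1
        if pc != latched.pc:
            pc += 4       # not a branch: fall through, no table access
            continue
        if latched.is_back_edge:
            iteration += 1
            taken = iteration < iterations
        else:
            # Assumed outcome pattern: taken on odd-numbered iterations.
            taken = iteration % 2 == 1
        pc = latched.target if taken else pc + 4
        next_index = latched.next_taken if taken else latched.next_not_taken
        latched = table[next_index]
        table_reads += 1
    return fetches, table_reads


fetches, table_reads = simulate(TABLE, entry_pc=0x100, iterations=4)
print(f"per-fetch BTB probes avoided: {fetches}, table reads: {table_reads}")
```

In this sketch every branch is recognized by a single comparison against the latched address, so identification is exact for the loop the table describes, and the number of table reads grows with the number of executed branches rather than with the number of fetched instructions.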
