Power efficient branch prediction through early identification of branch addresses

Ever increasing performance requirements have elevated deeply pipelined architectures to a standard even in the embedded processor domain, requiring the incorporation of dynamic branch prediction subsystems to hide the execution latency of control-altering instructions. In this paper a low power early branch identification technique which enables the design of extremely power-efficient branch predictors and BTBs is proposed. Through static extraction of program information regarding the distance to subsequent branches, this technique enables the calculation of the next branch address as soon as the direction of the current branch has been predicted. Early identification of branch addresses enables a complete elimination of the power hungry BTB lookups normally occurring at every execution cycle, as well as a just-in-time wake-up mechanism when accessing "hibernating" entries in complex predictors, switched to power-saving mode to reduce leakage power dissipation. A cost-efficient Branch Identification Unit (BIU) to calculate branch addresses is presented and analyzed in terms of power and timing characteristics. The effectiveness of the proposed BTB access policy and predictor wake-up mechanism is also confirmed by the simulation results of the SPECint 2000 and Media-bench benchmarks.

[1]  P. Petrov,et al.  Low-power branch target buffer for application-specific embedded processors , 2005 .

[2]  S. McFarling Combining Branch Predictors , 1993 .

[3]  T. Mudge,et al.  Drowsy caches: simple techniques for reducing leakage power , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.

[4]  Kevin Skadron,et al.  Power issues related to branch prediction , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.

[5]  Joseph T. Rahmeh,et al.  Improving the accuracy of dynamic branch prediction using branch correlation , 1992, ASPLOS V.

[6]  James E. Smith,et al.  A study of branch prediction strategies , 1981, ISCA '98.

[7]  Margaret Martonosi,et al.  Applying decay strategies to branch predictors for leakage energy savings , 2002, Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors.

[8]  M.C. Huang,et al.  Branch prediction on demand: an energy-efficient solution [microprocessor architecture] , 2003, Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03..

[9]  Vittorio Zaccaria,et al.  Low-power branch prediction techniques for VLIW architectures: a compiler-hints based approach , 2005, Integr..

[10]  Andreas Moshovos,et al.  SEPAS: A highly accurate energy-efficient branch predictor , 2004, Proceedings of the 2004 International Symposium on Low Power Electronics and Design (IEEE Cat. No.04TH8758).

[11]  Yale N. Patt,et al.  Alternative implementations of hybrid branch predictors , 1995, MICRO 1995.

[12]  Miodrag Potkonjak,et al.  MediaBench: a tool for evaluating and synthesizing multimedia and communications systems , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[13]  Norman P. Jouppi,et al.  Cacti 3. 0: an integrated cache timing, power, and area model , 2001 .

[14]  Margaret Martonosi,et al.  Cache decay: exploiting generational behavior to reduce cache leakage power , 2001, ISCA 2001.

[15]  Yale N. Patt,et al.  An effective programmable prefetch engine for on-chip caches , 1995, MICRO 1995.

[16]  Todd M. Austin,et al.  The SimpleScalar tool set, version 2.0 , 1997, CARN.

[17]  Andreas Moshovos,et al.  Branch predictor prediction: a power-aware branch predictor for high-performance processors , 2002, Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors.