Compiler-directed code restructuring for reducing data TLB energy

Prior work on TLB power optimization has considered circuit and architectural techniques. A recent software-based technique for data TLBs explored storing frequently used virtual-to-physical address translations in a set of translation registers (TRs) and using them, when necessary, instead of accessing the data TLB. This work presents a compiler-based strategy for increasing the effectiveness of TRs. The idea is to restructure the application code so that, once a TR is loaded, its contents are reused as much as possible. Our experimental evaluation with six array-based benchmarks from the Spec2000 suite indicates that the proposed TR reuse strategy significantly reduces data TLB energy compared with an alternative strategy that employs TRs but does not restructure the code for TR reuse.
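The effect of restructuring code for TR reuse can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it uses plain loop interchange as a stand-in restructuring, and it models a TR as a one-entry cache of the most recently translated page. The page size, element size, array shape, LRU replacement policy, and the function names `tr_reloads` and `row_major_addresses` are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # bytes per page (assumed)
ELEM_SIZE = 8     # bytes per array element (assumed)


def tr_reloads(addresses, num_trs=1):
    """Count how often a translation must be fetched from the dTLB.

    Models num_trs translation registers holding the most recently
    used page translations, with LRU replacement (an assumption).
    """
    trs = []      # pages currently held in TRs, most recent last
    reloads = 0
    for addr in addresses:
        page = addr // PAGE_SIZE
        if page in trs:
            trs.remove(page)      # hit: refresh LRU position
        else:
            reloads += 1          # miss: TR must be (re)loaded
            if len(trs) == num_trs:
                trs.pop(0)        # evict least recently used page
        trs.append(page)
    return reloads


def row_major_addresses(n_rows, n_cols, order):
    """Addresses touched when traversing a row-major array at base 0.

    order "ij" walks row by row; order "ji" walks column by column.
    """
    if order == "ij":
        pairs = ((i, j) for i in range(n_rows) for j in range(n_cols))
    elif order == "ji":
        pairs = ((i, j) for j in range(n_cols) for i in range(n_rows))
    else:
        raise ValueError("order must be 'ij' or 'ji'")
    return [(i * n_cols + j) * ELEM_SIZE for i, j in pairs]


# Each row of a 64 x 512 array of 8-byte elements fills exactly one page.
column_wise = tr_reloads(row_major_addresses(64, 512, "ji"))
row_wise = tr_reloads(row_major_addresses(64, 512, "ij"))
print(column_wise, row_wise)
```

Under these assumptions the column-wise traversal switches pages on every access and needs 32768 TR reloads, while the interchanged row-wise traversal needs only 64, one per page. Each avoided reload corresponds to a data TLB access that does not have to be made, which is the source of the energy savings the abstract describes.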
