Entropy-based low power data TLB design

The Translation Look-aside Buffer (TLB), a content-addressable memory, consumes significant power because of the associative search it performs during virtual-to-physical address translation. Our analysis of TLB accesses yields two observations. First, the entropy, or information content, of stack virtual page numbers is low because stack memory references have high spatial locality. Second, the entropy of the higher-order bits of global memory references is low because the size of the global data is determined and fixed when a program is compiled. Based on these two characteristics, we propose two techniques to reduce energy: an entropy-based speculative stack address TLB and a deterministic global address TLB. Our results show average energy savings of 47% in the data TLB with less than 1% overall performance impact.
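The first observation can be illustrated with a minimal sketch that measures the Shannon entropy of the virtual page numbers in an address stream. This is not the paper's measurement methodology: the 4 KiB page size, the helper names `entropy_bits` and `vpn`, and both synthetic traces are assumptions made only for illustration.

```python
import math
from collections import Counter

PAGE_SHIFT = 12  # assumption: 4 KiB pages, so the VPN is the address >> 12


def entropy_bits(symbols):
    """Shannon entropy, in bits per symbol, of the empirical distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def vpn(addr):
    """Virtual page number of an address under the assumed page size."""
    return addr >> PAGE_SHIFT


# Synthetic stack-like trace (hypothetical): 256 accesses walking down in
# 8-byte steps, so nearly all of them fall in the same page.
stack_trace = [0x7FFF_F000 - 8 * i for i in range(256)]

# Synthetic scattered trace (hypothetical): 256 accesses spread evenly over
# 64 distinct pages.
scattered_trace = [0x1000_0000 + 4096 * ((i * 37) % 64) for i in range(256)]

stack_entropy = entropy_bits(vpn(a) for a in stack_trace)
scattered_entropy = entropy_bits(vpn(a) for a in scattered_trace)

print(f"stack-like VPN entropy: {stack_entropy:.3f} bits")
print(f"scattered VPN entropy:  {scattered_entropy:.3f} bits")
```

Under these assumptions the stack-like trace touches only two pages and its page-number entropy is well below one bit, while the scattered trace uses 64 pages uniformly and reaches 6 bits. A low-entropy page-number stream is one whose next value is easy to predict, which is the property a speculative stack address TLB would rely on.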
