I-Cache Tag Reduction for Low Power Chip Multiprocessor

Energy consumption is a major consideration in microprocessor optimization. This paper presents a tag-reduction approach to saving energy in the L1 instruction cache (I-Cache) of chip multiprocessors (CMPs). To the best of our knowledge, this is the first work to extend the tag-reduction technique to CMPs. We formulate the approach as an equivalent assignment problem: find an assignment of all instruction pages in physical memory to a set of cores such that tag-reduction conflicts on each core are avoided or, failing that, reduced as far as possible. We then propose three algorithms that use different heuristics to solve this assignment problem. Experimental results show that the proposed algorithms reduce total power by up to 45.33% on average compared with a baseline that does not use tag reduction. They also significantly outperform the tag-reduction algorithm designed for single-core processors.
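To make the assignment formulation concrete, the following is a minimal sketch and not one of the three algorithms proposed in the paper. It assumes a simplified conflict model in which two pages conflict on a core whenever the low-order `kept_bits` bits of their page numbers coincide. The names `reduced_tag`, `greedy_assign`, and `kept_bits`, as well as the greedy tie-breaking rule, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def reduced_tag(page_number: int, kept_bits: int) -> int:
    """Keep only the low-order `kept_bits` bits of a page number (assumed model)."""
    return page_number & ((1 << kept_bits) - 1)


def greedy_assign(pages, num_cores, kept_bits):
    """Assign each page to the core where it currently causes the fewest conflicts.

    Assumption: two distinct pages conflict on a core when their reduced
    tags match. Returns (page -> core mapping, total conflicting pairs).
    """
    # per_core[c][t] = number of pages already on core c with reduced tag t
    per_core = [defaultdict(int) for _ in range(num_cores)]
    load = [0] * num_cores
    assignment = {}
    total_conflicts = 0
    for page in pages:
        tag = reduced_tag(page, kept_bits)
        # Prefer fewest conflicts, then the lightest load, then the lowest core id.
        core = min(range(num_cores),
                   key=lambda c: (per_core[c][tag], load[c], c))
        # Each page already on this core with the same reduced tag adds one conflicting pair.
        total_conflicts += per_core[core][tag]
        per_core[core][tag] += 1
        load[core] += 1
        assignment[page] = core
    return assignment, total_conflicts


if __name__ == "__main__":
    pages = [0x10, 0x20, 0x30, 0x11]  # made-up page numbers
    print(greedy_assign(pages, num_cores=1, kept_bits=4))
    print(greedy_assign(pages, num_cores=2, kept_bits=4))
```

Under this toy model, spreading the pages over two cores leaves fewer conflicting pairs than placing them all on one core, which is the intuition behind treating tag reduction on a CMP as a page-to-core assignment problem.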
