Linked instruction caches for enhancing power efficiency of embedded systems

Abstract The memory system accounts for 45% of the total power consumed by an embedded system, and a memory access consumes 10 times more power than a cache access. Increasing the cache hit rate can therefore reduce the power consumption of the memory system and improve system performance. In this study, we increased the cache hit rate and reduced the cache-access power consumption by developing a new cache architecture, the single linked cache (SLC), which stores frequently executed instructions. By adding a new link field, SLC combines the low power consumption and low access delay of a direct-mapped cache with a hit rate similar to that of a two-way set-associative cache. In addition, we developed a second design, multiple linked caches (MLC), to further reduce the power consumed by each cache access and to avoid unnecessary cache accesses when the requested instruction is absent from the cache. In MLC, the linked cache is split into several small linked caches that store frequently executed instructions, which reduces the power consumed by each access. To avoid unnecessary cache accesses when a requested instruction is not in the linked caches, the addresses of the frequently executed blocks are recorded in the branch target buffer (BTB). By consulting the BTB, the processor can fetch the requested instruction directly from memory if it is not in the cache. In simulations, our method outperformed selective compression, a traditional cache, and a filter cache in terms of cache hit rate, power consumption, and execution time.
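The link between hit rate and fetch energy stated in the abstract can be illustrated with a back-of-the-envelope model. The sketch below is not the paper's power model. It assumes only the 10:1 memory-to-cache cost ratio quoted above, normalizes one cache access to 1 energy unit, and uses hypothetical function names and hit rates. The second function models the BTB idea in its simplest form, assuming every miss is known in advance so the cache probe is skipped.

```python
def energy_per_fetch(hit_rate, e_cache=1.0, mem_to_cache_ratio=10.0):
    """Average energy per instruction fetch, in units of one cache access.

    Assumption: every fetch probes the cache, and each miss additionally
    pays for one memory access costing mem_to_cache_ratio * e_cache.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return e_cache + miss_rate * mem_to_cache_ratio * e_cache


def energy_per_fetch_with_bypass(hit_rate, e_cache=1.0, mem_to_cache_ratio=10.0):
    """Same model, but misses skip the cache probe entirely.

    Assumption (idealized): a lookup structure such as the BTB identifies
    every miss beforehand, so only hits pay the cache-access cost.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return hit_rate * e_cache + miss_rate * mem_to_cache_ratio * e_cache


if __name__ == "__main__":
    # Hypothetical hit rates, chosen only to show the trend.
    for h in (0.80, 0.90, 0.95, 0.99):
        print(f"hit rate {h:.2f}: "
              f"probe-always {energy_per_fetch(h):.2f}, "
              f"bypass-on-miss {energy_per_fetch_with_bypass(h):.2f}")
```

Under these assumptions, raising the hit rate from 0.90 to 0.95 lowers the average fetch cost from about 2.0 to about 1.5 cache-access units, which is why a small gain in hit rate matters when a memory access is 10 times more expensive than a cache access.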
