Near Threshold Last Level Cache for Energy Efficient Embedded Applications

State-of-the-art embedded processors are used in domains such as vision-based and big-data applications. These applications process a large amount of information per task and therefore access main memory frequently to complete their computation. In this scenario, a larger last-level cache (LLC) improves system performance and throughput by substantially reducing the global miss rate and miss penalty. A larger cache, however, consumes more power, which is especially significant for battery-driven mobile devices. Near-threshold operation of memory cells is a promising way to save a substantial amount of energy in such applications. We propose a cache architecture that combines near-threshold and standard LLC operation to meet the required power and performance constraints. A controller unit dynamically switches the LLC between the standard and near-threshold operating regions based on application-specific operations. The controller can also power gate a portion of the LLC to further reduce leakage power. By simulating different MiBench benchmarks, we show that the proposed cache architecture reduces average energy consumption by 22%, with a minimal average runtime penalty of 2.5%, relative to a baseline architecture with no cache reconfigurability.
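The controller's role can be illustrated with a minimal sketch. The abstract does not specify the decision policy, so everything below is an assumption made for illustration: the per-interval counters, the names `choose_config`, `busy_accesses`, and `low_miss_rate`, the threshold values, and the choice to gate exactly half of the ways are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    STANDARD = "standard"              # nominal supply voltage
    NEAR_THRESHOLD = "near_threshold"  # reduced supply voltage, slower access


@dataclass
class LLCConfig:
    mode: Mode
    active_ways: int  # ways left powered; the remaining ways are power gated


def choose_config(llc_accesses: int, llc_misses: int, total_ways: int = 8,
                  busy_accesses: int = 10_000,
                  low_miss_rate: float = 0.05) -> LLCConfig:
    """Pick an LLC configuration for the next interval (hypothetical policy)."""
    if total_ways < 2:
        raise ValueError("total_ways must be at least 2")
    miss_rate = llc_misses / llc_accesses if llc_accesses else 0.0
    # Assumption: heavy LLC traffic marks a performance-critical phase,
    # so the cache stays at standard voltage; otherwise it drops to
    # near-threshold operation to save energy.
    mode = Mode.STANDARD if llc_accesses >= busy_accesses else Mode.NEAR_THRESHOLD
    # Assumption: a low miss rate suggests the working set fits in fewer
    # ways, so half of the ways are power gated to cut leakage.
    active = total_ways // 2 if miss_rate <= low_miss_rate else total_ways
    return LLCConfig(mode, active)
```

In this sketch, voltage mode and power gating are decided independently, so a lightly used cache with a small working set ends up both near threshold and half gated. A real controller would also have to account for the cost of switching modes and of losing the contents of gated ways, which this example ignores.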
