In-network Monitoring and Control Policy for DVFS of CMP Networks-on-Chip and Last Level Caches

In chip design today and for the foreseeable future, on-chip communication is not only a performance bottleneck but also a substantial power consumer. This work focuses on dynamic voltage and frequency scaling (DVFS) policies for networks-on-chip (NoC) and shared, distributed last-level caches (LLC). In particular, we consider a practical system architecture in which the distributed LLC and the NoC share a voltage/frequency domain that is separate from the core domain. This architecture enables control of the relative speed between the cores and the memory hierarchy without introducing synchronization delays within the NoC. DVFS for this architecture is more difficult than individual link- or core-based DVFS because it involves spatially distributed monitoring and control. We propose an average memory access time (AMAT)-based monitoring technique and integrate it with DVFS based on PID control theory. Simulations on PARSEC benchmarks yield dynamic energy savings of 33% with a negligible impact on system performance.
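As a rough illustration of how an AMAT measurement could drive a PID-based frequency decision, the following is a minimal sketch and not the controller evaluated in this work. It assumes the textbook definition AMAT = hit time + miss rate × miss penalty, a normalized AMAT error as the control input, and hypothetical gains and frequency bounds (`kp`, `ki`, `kd`, `f_min`, `f_max`, in GHz); the abstract does not specify any of these.

```python
from dataclasses import dataclass


def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Textbook average memory access time (assumed form, in cycles)."""
    return hit_time + miss_rate * miss_penalty


@dataclass
class PIDController:
    """Discrete PID controller; the gains are illustrative assumptions."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, error: float, dt: float = 1.0) -> float:
        # Accumulate the integral term and estimate the derivative
        # from the previous control interval.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def next_frequency(freq: float, measured_amat: float, target_amat: float,
                   pid: PIDController,
                   f_min: float = 1.0, f_max: float = 2.0) -> float:
    """Pick the NoC/LLC domain frequency for the next control interval.

    A measured AMAT above the target gives a positive error and raises
    the frequency; an AMAT below the target lowers it to save energy.
    The bounds f_min and f_max are hypothetical.
    """
    error = (measured_amat - target_amat) / target_amat
    proposed = freq + pid.update(error)
    return min(f_max, max(f_min, proposed))
```

In this sketch a single controller sets one frequency for the shared NoC/LLC domain. How AMAT samples are gathered from spatially distributed nodes is left out, since the abstract does not describe the collection mechanism.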
