Cross-Layer Memory Management to Improve DRAM Energy Efficiency

Controlling the distribution and usage of memory power is often difficult, because these effects typically depend on activity across multiple layers of the vertical execution stack. To address this challenge, we construct a novel and collaborative framework that employs object placement, cross-layer communication, and page-level management to effectively distribute application objects in the DRAM hardware to achieve desired power/performance goals. This work describes the design and implementation of our framework, which is the first to integrate automatic object profiling and analysis at the application layer with fine-grained management of memory hardware resources in the operating system. We demonstrate the utility of this framework by employing it to control memory power consumption more effectively. First, we design a custom memory-intensive workload to show the potential of this approach to reduce DRAM energy consumption. Next, we develop sampling and profiling-based analyses and modify the code generator in the HotSpot VM to understand object usage patterns and automatically control the placement of hot and cold objects in a partitioned VM heap. This information is communicated to the operating system, which uses it to map the logical application pages to the appropriate DRAM modules according to user-defined provisioning goals. The evaluation shows that our Java VM-based framework achieves our goal of significant DRAM energy savings across a variety of workloads, without any source code modifications or recompilations.
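As a rough illustration of the hot/cold placement idea described above, the following minimal sketch labels allocation sites as hot or cold from profile data. It is not the framework's actual analysis: the `SiteProfile` record, the `partition_sites` function, the greedy density ranking, the byte budget, and the sample site names are all hypothetical and are assumed here only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteProfile:
    site: str              # hypothetical allocation-site identifier
    bytes_allocated: int   # total bytes allocated at this site
    sampled_accesses: int  # access samples attributed to this site


def partition_sites(profiles, hot_byte_budget):
    """Greedily label the most densely accessed sites hot until the budget is used.

    Assumption: "hotness" is approximated by sampled accesses per allocated byte.
    Sites with no sampled accesses are always labeled cold.
    """
    ranked = sorted(
        profiles,
        key=lambda p: p.sampled_accesses / max(p.bytes_allocated, 1),
        reverse=True,
    )
    hot, cold, used = [], [], 0
    for p in ranked:
        if p.sampled_accesses > 0 and used + p.bytes_allocated <= hot_byte_budget:
            hot.append(p.site)
            used += p.bytes_allocated
        else:
            cold.append(p.site)
    return hot, cold


# Made-up profile data for illustration only.
profiles = [
    SiteProfile("Buffer.<init>", 4096, 8000),
    SiteProfile("LogRecord.<init>", 65536, 10),
    SiteProfile("Index.<init>", 8192, 4000),
    SiteProfile("Archive.<init>", 131072, 0),
]
hot, cold = partition_sites(profiles, hot_byte_budget=16384)
print("hot:", hot)
print("cold:", cold)
```

In a setup like the one the abstract describes, objects from sites labeled hot would be allocated in one heap partition and the rest in another, so that the operating system could back each partition with a different set of DRAM modules. The budget here stands in for a user-defined provisioning goal.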
