Memory Cocktail Therapy: A General Learning-Based Framework to Optimize Dynamic Tradeoffs in NVMs

Non-volatile memories (NVMs) have attracted significant interest recently due to their high density, low static power, and persistence. There are, however, several challenges associated with building practical systems from NVMs, including limited write endurance and long latencies. Researchers have proposed a variety of architectural techniques that achieve different tradeoffs between lifetime, performance, and energy efficiency; however, no individual technique can satisfy the requirements of all applications and objectives. Hence, we propose Memory Cocktail Therapy (MCT), a general learning-based framework that adaptively chooses the best techniques for the current application and objectives.

Specifically, MCT performs four procedures to adapt the techniques to various scenarios. First, MCT formulates a high-dimensional configuration space from all different combinations of techniques. Second, MCT selects primary features from the configuration space with lasso regularization. Third, MCT estimates lifetime, performance, and energy consumption using lightweight online predictors (e.g., quadratic regression and gradient boosting) and a small set of configurations guided by the selected features. Finally, given the estimates for all configurations, MCT selects the optimal configuration based on the user-defined objectives. As a proof of concept, we test MCT's ability to guarantee different lifetime targets and achieve 95% of maximum performance while minimizing energy consumption. We find that MCT improves performance by 9.24% and reduces energy by 7.95% compared to the best static configuration. Moreover, the performance of MCT is 94.49% of the ideal configuration with only 5.3% more energy consumption.

CCS CONCEPTS • Hardware → Memory and dense storage.
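The first and last of the four procedures can be illustrated with a minimal Python sketch. It is not the paper's implementation: the technique names in `TECHNIQUES`, the numbers in `toy_predictor`, and the reading of "95% of maximum performance" as relative to the best configuration that meets the lifetime target are all assumptions made for illustration. The toy predictor stands in for the lasso feature selection and online regression steps, which are not reproduced here.

```python
from itertools import product

# Hypothetical technique knobs; names and values are illustrative only.
TECHNIQUES = {
    "slow_write": (False, True),
    "write_cancellation": (False, True),
    "wear_quota": (False, True),
}


def configuration_space(techniques):
    """Step 1: enumerate every combination of technique settings."""
    names = list(techniques)
    return [dict(zip(names, values)) for values in product(*techniques.values())]


def toy_predictor(config):
    """Stand-in for steps 2 and 3 (feature selection and online prediction).

    The coefficients are made up; a real predictor would be fitted to
    measurements from a small sample of configurations.
    """
    lifetime = 4.0 + 3.0 * config["slow_write"] + 2.0 * config["wear_quota"]
    performance = (1.0 - 0.10 * config["slow_write"]
                   + 0.04 * config["write_cancellation"]
                   - 0.03 * config["wear_quota"])
    energy = 1.0 + 0.05 * config["write_cancellation"] - 0.04 * config["slow_write"]
    return {"config": config, "lifetime": lifetime,
            "performance": performance, "energy": energy}


def select_configuration(predictions, lifetime_target, perf_fraction=0.95):
    """Step 4: pick a configuration from predicted metrics.

    Keep configurations that meet the lifetime target, then those within
    perf_fraction of the best feasible performance, then minimize energy.
    Returns None if no configuration meets the lifetime target.
    """
    feasible = [p for p in predictions if p["lifetime"] >= lifetime_target]
    if not feasible:
        return None
    best_perf = max(p["performance"] for p in feasible)
    fast_enough = [p for p in feasible
                   if p["performance"] >= perf_fraction * best_perf]
    return min(fast_enough, key=lambda p: p["energy"])


predictions = [toy_predictor(c) for c in configuration_space(TECHNIQUES)]
best = select_configuration(predictions, lifetime_target=8.0)
print(best["config"])
```

Ordering the objectives as constraint, then performance band, then energy mirrors the proof-of-concept objective stated in the abstract; other user-defined objectives would change only `select_configuration`.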
