AXES: Approximation Manager for Emerging Memory Architectures

Memory approximation techniques are commonly limited in scope, each targeting an individual level of the memory hierarchy. Existing techniques that approximate the full memory hierarchy determine an optimal configuration at design time for a given goal and application. Such policies are rigid: they cannot adapt to unknown workloads, and they must be redesigned for different memory configurations and technologies. We propose AXES, the first self-optimizing runtime manager that coordinates configurable approximation knobs across all levels of the memory hierarchy. AXES continuously updates and optimizes its approximation management policy at runtime for diverse workloads, configuring the approximate memory to minimize power consumption without violating the quality threshold specified by application developers. AXES can (1) learn a policy at runtime to manage variable application quality-of-service (QoS) constraints, (2) automatically optimize for a target metric within those constraints, and (3) coordinate runtime decisions for interdependent knobs and subsystems. We demonstrate that AXES efficiently provides capabilities (1)-(3) on a RISC-V Linux platform with approximate memory segments in the on-chip cache and main memory. AXES saves up to 37% of the energy in the memory subsystem without any design-time overhead, and it reduces QoS violations by 75% at a cost of less than 5% additional energy.
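
As a rough illustration of capabilities (1)-(3), the sketch below uses tabular Q-learning to pick a joint setting for two approximation knobs under a changing QoS target. It is a minimal toy under stated assumptions, not the AXES implementation: the knob levels, the `measure` quality and power models, the QoS targets, the penalty value, and all function names are hypothetical, and the abstract does not state which learning algorithm AXES uses. In this toy the chosen action does not influence the next state, so the problem is close to a contextual bandit.

```python
import random
from itertools import product

# Hypothetical knobs: approximation level for the cache and for main memory (0 = exact).
LEVELS = (0, 1, 2)
ACTIONS = list(product(LEVELS, LEVELS))   # joint (cache_level, dram_level) settings
QOS_TARGETS = (90, 80, 70)                # hypothetical quality thresholds, in percent


def measure(cache_level, dram_level):
    """Toy stand-in for runtime sensors; returns (quality %, power units).

    The product term makes the knobs interdependent: errors compound.
    """
    quality = 100 - 5 * cache_level - 8 * dram_level - 3 * cache_level * dram_level
    power = 100 - 10 * cache_level - 15 * dram_level
    return quality, power


def reward(target, action):
    """Penalize QoS violations heavily; otherwise reward low power."""
    quality, power = measure(*action)
    if quality < target:
        return -2.0          # assumed penalty, larger than any power cost
    return -power / 100.0


def train(steps=30000, alpha=0.2, gamma=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    n_states = len(QOS_TARGETS)
    q = {(s, a): 0.0 for s in range(n_states) for a in ACTIONS}
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                      # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        r = reward(QOS_TARGETS[state], action)
        next_state = (state + 1) % n_states   # toy workload: QoS target cycles
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Map each QoS target to the knob setting with the highest learned value."""
    return {
        QOS_TARGETS[s]: max(ACTIONS, key=lambda a: q[(s, a)])
        for s in range(len(QOS_TARGETS))
    }


policy = greedy_policy(train())
for target, (cache_level, dram_level) in policy.items():
    quality, power = measure(cache_level, dram_level)
    print(f"QoS>={target}%: cache={cache_level} dram={dram_level} "
          f"quality={quality}% power={power}")
```

The reward encodes the stated objective directly: a large penalty whenever measured quality falls below the target, and otherwise the negative of the power draw, so the learned policy tends toward the lowest-power configuration that still meets each threshold.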
