A Survey of Techniques for Approximate Computing

Approximate computing (AC) trades computation quality against the effort expended, and as rising performance demands confront plateauing resource budgets, it has become not merely attractive but imperative. In this article, we present a survey of AC techniques. We discuss strategies for finding approximable program portions and monitoring output quality; techniques for using AC in different processing units (e.g., CPU, GPU, and FPGA), processor components, memory technologies, and so forth; and programming frameworks for AC. We classify these techniques based on several key characteristics to emphasize their similarities and differences. The aim of this article is to give researchers insight into the working of AC techniques and to inspire more efforts in this area to make AC the mainstream computing approach in future systems.
