A Taxonomy of Approximate Computing Techniques

Approximate computing is the idea that systems can gain performance and energy efficiency if they expend less effort on producing a "perfect" answer. Approximate computing techniques propose various ways of exposing and exploiting accuracy-efficiency trade-offs. We present a taxonomy that classifies approximate computing techniques according to their most salient features: compute vs. data, deterministic vs. nondeterministic, and coarse- vs. fine-grained. These axes allow us to address questions about the visibility, testability, and flexibility of different techniques. We use this taxonomy to inform future research in approximate architectures, compilers, and applications that will catalyze mainstream adoption of approximate computing.
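
The three axes named above can be read as independent dimensions of a small classification record. The following is a minimal sketch of that reading, not the authors' formalization: the class names, the `select` helper, and the two catalog entries (including where each is placed on the axes) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    """What is approximated: the computation or the data."""
    COMPUTE = "compute"
    DATA = "data"


class Determinism(Enum):
    """Whether repeated runs on the same input give the same result."""
    DETERMINISTIC = "deterministic"
    NONDETERMINISTIC = "nondeterministic"


class Granularity(Enum):
    """The size of the unit to which approximation is applied."""
    COARSE = "coarse-grained"
    FINE = "fine-grained"


@dataclass(frozen=True)
class Technique:
    name: str
    target: Target
    determinism: Determinism
    granularity: Granularity


def select(techniques, **axes):
    """Return the techniques whose axis values match every keyword given."""
    return [
        t for t in techniques
        if all(getattr(t, axis) == value for axis, value in axes.items())
    ]


# Hypothetical entries; their placements are assumptions, not taken from the source.
catalog = [
    Technique("fixed-stride loop skipping", Target.COMPUTE,
              Determinism.DETERMINISTIC, Granularity.COARSE),
    Technique("reduced-refresh memory", Target.DATA,
              Determinism.NONDETERMINISTIC, Granularity.FINE),
]

# Example query: which catalog entries give repeatable results?
repeatable = select(catalog, determinism=Determinism.DETERMINISTIC)
print([t.name for t in repeatable])
```

Keeping each axis as its own enumeration means a question about one axis, such as which techniques give repeatable results, becomes a single filter over that axis.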
