A programming model and runtime system for significance-aware energy-efficient computing

We introduce a task-based programming model and runtime system that exploit the observation that not all parts of a program are equally significant for the accuracy of the end result, in order to trade off the quality of program outputs for increased energy efficiency. The trade-off is expressed in a structured and flexible way, which makes it easy to explore different points in the quality/energy space without adversely affecting application performance. The runtime system can apply a number of different policies to decide whether to execute less-significant tasks accurately or approximately. Our experimental evaluation indicates that the system can reduce energy consumption by up to 83% compared with a fully accurate execution and by up to 35% compared with an approximate version employing loop perforation. At the same time, our approach always results in graceful quality degradation.
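The idea of significance-driven task execution can be illustrated with a minimal sketch. This is not the system's actual API: the `Task` class, the `run_with_ratio` function, the 0.0 to 1.0 significance scale, and the "run the most significant fraction accurately" policy are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Task:
    """Hypothetical task descriptor (not the paper's interface)."""
    significance: float                                 # assumed scale: 0.0 (least) to 1.0 (most)
    accurate: Callable[[], float]                       # full-quality task body
    approximate: Optional[Callable[[], float]] = None   # cheaper body; None means "drop the task"


def run_with_ratio(tasks: List[Task], ratio: float) -> List[Optional[float]]:
    """Run the `ratio` most significant tasks accurately and the rest approximately.

    This is one possible policy, assumed here for illustration.
    """
    n_accurate = round(ratio * len(tasks))
    # Rank task indices from most to least significant.
    ranked = sorted(range(len(tasks)),
                    key=lambda i: tasks[i].significance,
                    reverse=True)
    accurate_ids = set(ranked[:n_accurate])

    results: List[Optional[float]] = []
    for i, task in enumerate(tasks):
        if i in accurate_ids:
            results.append(task.accurate())
        elif task.approximate is not None:
            results.append(task.approximate())
        else:
            results.append(None)  # task dropped: no approximate body was given
    return results


def make_demo_tasks() -> List[Task]:
    """Four toy tasks whose approximate bodies return a slightly wrong value."""
    significances = [0.9, 0.1, 0.5, 0.3]
    return [Task(s, accurate=lambda: 10.0, approximate=lambda: 9.0)
            for s in significances]
```

Calling `run_with_ratio(make_demo_tasks(), 0.5)` runs the two most significant tasks (significance 0.9 and 0.5) accurately and the other two approximately. Lowering `ratio` moves the execution toward the low-energy, lower-quality end of the trade-off, while `ratio = 1.0` corresponds to a fully accurate run.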
