Proactive Control of Approximate Programs

Approximate computing trades accuracy of results for savings in resources such as energy or computing time. The large and rapidly growing literature on approximate computing has focused mostly on demonstrating its benefits; relatively little is known about how to control approximation in a disciplined way. In this paper, we address the problem of controlling approximation for non-streaming programs that expose a set of "knobs", each of which can be dialed up or down to control the level of approximation of a different component of the program. We formulate this control problem as a constrained optimization problem and describe a system called Capri that uses machine learning to learn cost and error models for the program. Capri then uses these models to determine, for a desired level of approximation, knob settings that optimize metrics such as running time or energy usage. Experimental results with complex benchmarks from different problem domains demonstrate the effectiveness of this approach.
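The constrained optimization described above can be illustrated with a minimal sketch: given a cost model and an error model over knob settings, pick the cheapest setting whose predicted error stays within a bound. Everything in the sketch is an assumption made for illustration. The knob names `k1` and `k2`, their ranges, the two hand-written model functions, and the exhaustive search are hypothetical stand-ins, not Capri's learned models or its actual search procedure.

```python
from itertools import product

# Hypothetical knobs: each takes an integer level, higher = more approximation.
KNOB_LEVELS = {"k1": range(5), "k2": range(5)}

def predicted_cost(setting):
    # Stand-in for a learned cost model (e.g. running time or energy).
    return 10.0 - 1.5 * setting["k1"] - 1.0 * setting["k2"]

def predicted_error(setting):
    # Stand-in for a learned error model.
    return 0.04 * setting["k1"] + 0.01 * setting["k2"] ** 2

def choose_knobs(error_bound):
    """Return (cost, setting) minimizing predicted cost subject to
    predicted_error(setting) <= error_bound, or None if nothing is feasible."""
    names = list(KNOB_LEVELS)
    best = None
    # Exhaustive search over all knob combinations; fine for tiny spaces only.
    for values in product(*(KNOB_LEVELS[name] for name in names)):
        setting = dict(zip(names, values))
        if predicted_error(setting) > error_bound:
            continue  # violates the error constraint
        cost = predicted_cost(setting)
        if best is None or cost < best[0]:
            best = (cost, setting)
    return best

if __name__ == "__main__":
    print(choose_knobs(0.1))
```

Exhaustive enumeration is used here only because it makes the constraint and the objective explicit; its cost grows exponentially with the number of knobs, so a realistic controller would need a more scalable solver.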
