On-line Application Autotuning Exploiting Ensemble Models

Application autotuning is a promising approach, investigated in the literature, to improve computation efficiency. In this context, end-users define high-level requirements, and an autonomic manager identifies and seizes optimization opportunities by leveraging trade-offs between extra-functional properties of interest, such as execution time, power consumption, or quality of results. The relationship between an application configuration and the extra-functional properties may depend on the underlying architecture, on the system workload, and on features of the current input. For these reasons, autotuning frameworks rely on application knowledge to drive their adaptation strategies. The autotuning task is typically performed offline, because running it in production requires significant effort to keep its overhead low. In this paper, we enhance a dynamic autotuning framework with a module for learning the application knowledge during the production phase, in a distributed fashion. We leverage two strategies to limit the overhead introduced in the production phase. On one hand, we use a scalable infrastructure capable of exploiting the parallelism of the underlying platform. On the other hand, we use ensemble models to obtain predictive capabilities quickly while iteratively gathering production data. Experimental results on synthetic applications and on a use case show how the proposed approach is able to learn the application knowledge by exploring only a small fraction of the design space.
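To illustrate the kind of ensemble model the abstract refers to, the sketch below shows a minimal bagging-style ensemble that predicts an extra-functional property (execution time) from a tunable knob value, using bootstrap resamples of production measurements. All names, the single-knob setting, and the sample data are illustrative assumptions, not the paper's actual implementation.

```python
import random

def fit_linear(xs, ys):
    # Closed-form least squares for y = a*x + b.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx

def bagged_predict(samples, x, n_models=25, seed=0):
    # Train each base model on a bootstrap resample of the gathered
    # production data, then average their predictions (bagging).
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        boot = [rng.choice(samples) for _ in samples]
        a, b = fit_linear([s[0] for s in boot], [s[1] for s in boot])
        preds.append(a * x + b)
    return sum(preds) / len(preds)

# Hypothetical production samples: (knob value, measured time in ms).
samples = [(1, 10.2), (2, 19.8), (3, 30.5), (4, 39.9), (5, 50.3)]
print(bagged_predict(samples, 6))  # expect a value near 60 ms
```

As new production samples arrive, the list grows and the ensemble is simply refit, which keeps per-update cost low while predictions stabilize with more data.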
