Towards rigorous validation of energy optimisation experiments

The optimisation of software energy consumption is of growing importance across all scales of modern computing, from embedded systems to data centres. Practitioners in Search-Based Software Engineering and Genetic Improvement of Software acknowledge that optimising software energy consumption is difficult because fitness evaluations are noisy and expensive. However, results to date show that more progress is needed in rigorously validating optimisation results. This problem is pressing because modern computing platforms exhibit highly complex and variable behaviour with respect to energy consumption. To compare solutions fairly, we propose a new validation approach, R3-validation, which exercises software variants in a rotated-round-robin order. Using a case study, we present an in-depth analysis of how changing system states affect software energy usage, and we show how R3-validation mitigates these effects. We compare it with current validation approaches across multiple devices and operating systems, and we show that it aligns best with actual platform behaviour.
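
The rotated-round-robin ordering can be illustrated with a minimal sketch. The abstract does not give the details of R3-validation, so the following rests on an assumption: every round exercises each variant once, and the starting variant shifts by one position from round to round, so that no variant always runs first or last. The function name `rotated_round_robin` and the variant labels are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def rotated_round_robin(variants: Sequence[T], rounds: int) -> List[List[T]]:
    """Build one execution order per round, rotating the start each round.

    Assumption (not from the paper): round r starts at variant r mod n and
    then wraps around, so each variant occupies each position equally often
    once the number of rounds is a multiple of n.
    """
    if not variants:
        return []
    n = len(variants)
    schedule: List[List[T]] = []
    for r in range(rounds):
        shift = r % n
        # Rotate left by `shift`: the tail moves to the front.
        schedule.append(list(variants[shift:]) + list(variants[:shift]))
    return schedule


if __name__ == "__main__":
    # Three hypothetical software variants, three rounds.
    for order in rotated_round_robin(["A", "B", "C"], 3):
        print(" ".join(order))
```

In this sketch, a slow drift in system state during a round affects each variant in a different position across rounds, rather than always penalising the variant that happens to run last.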
