BoGraph: Structured Bayesian Optimization From Logs for Systems with High-dimensional Parameter Space

Current auto-tuning frameworks struggle to tune computer system configurations because of their large parameter spaces, complex interdependencies, and high evaluation cost. Structured Bayesian Optimization (SBO) [18], which uses probabilistic models, has recently overcome these difficulties. SBO decomposes the parameter space using contextual information provided by system experts, which leads to fast convergence. However, the complexity of building probabilistic models has hindered its wider adoption. We propose BoGraph, an SBO framework that learns the system structure from its logs. BoGraph provides an API that lets experts encode their knowledge of the system as performance models or component dependencies. BoGraph takes the learned structure and transforms it into a probabilistic graph model. It then applies the expert-provided knowledge to the graph to further contextualize the system behavior. BoGraph's probabilistic graph allows the optimizer to find efficient configurations faster than other methods. We evaluate BoGraph on a hardware architecture search problem, where it improves energy-latency objectives by a factor of 5-7x over the default architecture. With its novel contextual structure learning pipeline, BoGraph makes SBO accessible for a wide range of other computer systems, such as databases and stream processors.
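To make the idea of learning a system structure from logs more concrete, the following is a minimal sketch and not BoGraph's actual algorithm or API. It assumes that each evaluated configuration yields one record of parameters, intermediate metrics parsed from logs, and an objective, and it links variables across these three layers whenever their Pearson correlation exceeds a threshold. The function names, the variable names (`cache_size`, `num_lanes`, `cache_misses`, `cycles`, `latency`), the data, and the threshold are all hypothetical and chosen only for illustration.

```python
from itertools import product
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def learn_structure(runs, params, metrics, objectives, threshold=0.5):
    """Return directed edges params -> metrics -> objectives.

    Edges only point forward through the layers, so the graph is acyclic.
    This correlation heuristic is an assumption made for illustration.
    """
    columns = {name: [run[name] for run in runs] for name in runs[0]}
    edges = []
    for layer_from, layer_to in ((params, metrics), (metrics, objectives)):
        for src, dst in product(layer_from, layer_to):
            if abs(pearson(columns[src], columns[dst])) >= threshold:
                edges.append((src, dst))
    return edges


# Hypothetical log-derived records, one per evaluated configuration.
runs = [
    {"cache_size": 1, "num_lanes": 1, "cache_misses": 100, "cycles": 400, "latency": 800},
    {"cache_size": 2, "num_lanes": 1, "cache_misses": 50, "cycles": 400, "latency": 600},
    {"cache_size": 1, "num_lanes": 2, "cache_misses": 100, "cycles": 200, "latency": 600},
    {"cache_size": 2, "num_lanes": 2, "cache_misses": 50, "cycles": 200, "latency": 400},
]

edges = learn_structure(
    runs,
    params=["cache_size", "num_lanes"],
    metrics=["cache_misses", "cycles"],
    objectives=["latency"],
)
print(edges)
```

In this toy setting each parameter ends up connected only to the metric it influences, so a surrogate model would need to relate each metric to a small set of parents rather than to the full parameter space, which is the kind of decomposition the abstract attributes to SBO.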

[1] Kevin Leyton-Brown, et al. An evaluation of sequential model-based optimization for expensive blackbox functions, 2013, GECCO.

[2] Aric Hagberg, et al. Exploring Network Structure, Dynamics, and Function using NetworkX, 2008, Proceedings of the Python in Science Conference.

[3] Noah D. Goodman, et al. Pyro: Deep Universal Probabilistic Programming, 2018, J. Mach. Learn. Res.

[4] Ding Yuan, et al. Characterizing logging practices in open-source software, 2012, 2012 34th International Conference on Software Engineering (ICSE).

[5] Gu-Yeon Wei, et al. SMAUG, 2019, ACM Trans. Archit. Code Optim.

[6] Aaron Klein, et al. BOHB: Robust and Efficient Hyperparameter Optimization at Scale, 2018, ICML.

[7] Nando de Freitas, et al. Taking the Human Out of the Loop: A Review of Bayesian Optimization, 2016, Proceedings of the IEEE.

[8] Kunle Olukotun, et al. Practical Design Space Exploration, 2018, 2019 IEEE 27th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS).

[9] Thierry Moreau, et al. Learning to Optimize Tensor Programs, 2018, NeurIPS.

[10] Qiang Fu, et al. Where do developers log? an empirical study on logging practices in industry, 2014, ICSE Companion.

[11] Sriram Rao, et al. Dhalion: Self-Regulating Stream Processing in Heron, 2017, Proc. VLDB Endow.

[12] Isis Truck, et al. Using Reinforcement Learning for Autonomic Resource Allocation in Clouds: towards a fully automated workflow, 2011.

[13] Xiaoyu Lu, et al. Causal Bayesian Optimization, 2020, AISTATS.

[14] Christopher K. I. Williams, et al. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning), 2005.

[15] Fabio Stella, et al. A survey on Bayesian network structure learning from data, 2019, Progress in Artificial Intelligence.

[16] Gu-Yeon Wei, et al. MachSuite: Benchmarks for accelerator design and customized architectures, 2014, 2014 IEEE International Symposium on Workload Characterization (IISWC).

[17] Daniel R. Jiang, et al. BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization, 2020, NeurIPS.

[18] Mark Horowitz, et al. Energy-performance tradeoffs in processor architecture and circuit design: a marginal cost analysis, 2010, ISCA.

[19] Andrew Pavlo, et al. An Inquiry into Machine Learning-based Automatic Configuration Tuning Services on Real-World Database Management Systems, 2021, Proc. VLDB Endow.

[20] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[21] Vivienne Sze, et al. Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices, 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.

[22] Bruce Momjian, et al. PostgreSQL: Introduction and Concepts, 2000.

[23] Adnan Darwiche, et al. Modeling and Reasoning with Bayesian Networks, 2009.

[24] Heng Tao Shen, et al. Principal Component Analysis, 2009, Encyclopedia of Biometrics.

[25] Gu-Yeon Wei, et al. Determining Optimal Coherency Interface for Many-Accelerator SoCs Using Bayesian Optimization, 2019, IEEE Computer Architecture Letters.

[26] Yoshua Bengio, et al. Random Search for Hyper-Parameter Optimization, 2012, J. Mach. Learn. Res.

[27] Gu-Yeon Wei, et al. Co-designing accelerators and SoC interfaces using gem5-Aladdin, 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[28] D. Mackay, et al. Bayesian neural networks and density networks, 1995.

[29] Li Da-qing. Survey of Bayesian network inference algorithms, 2008.

[30] Philip Bachman, et al. Deep Reinforcement Learning that Matters, 2017, AAAI.

[31] Paul Kline, et al. An easy guide to factor analysis, 1993.

[32] Scott Shenker, et al. Spark: Cluster Computing with Working Sets, 2010, HotCloud.

[33] Samy Bengio, et al. Device Placement Optimization with Reinforcement Learning, 2017, ICML.

[34] Niall Murphy, et al. Site Reliability Engineering: How Google Runs Production Systems, 2016.

[35] Andy D. Pimentel. Exploring Exploration: A Tutorial Introduction to Embedded Systems Design Space Exploration, 2017, IEEE Design & Test.

[36] Nir Friedman, et al. Probabilistic Graphical Models - Principles and Techniques, 2009.

[37] Jasper Snoek, et al. Practical Bayesian Optimization of Machine Learning Algorithms, 2012, NIPS.

[38] Gu-Yeon Wei, et al. Aladdin: A pre-RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures, 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).

[39] Kirthevasan Kandasamy, et al. RubberBand: cloud-based hyperparameter tuning, 2021, EuroSys.

[40] Andrew Gordon Wilson, et al. GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration, 2018, NeurIPS.

[41] Seif Haridi, et al. Apache Flink™: Stream and Batch Processing in a Single Engine, 2015, IEEE Data Eng. Bull.

[42] Andy J. Keane, et al. Engineering Design via Surrogate Modelling - A Practical Guide, 2008.

[43] Ameet Talwalkar, et al. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization, 2016, J. Mach. Learn. Res.

[44] Neil D. Lawrence, et al. Deep Gaussian Processes, 2012, AISTATS.

[45] Ion Stoica, et al. Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics, 2016, NSDI.

[46] Max Jaderberg, et al. Population Based Training of Neural Networks, 2017, ArXiv.

[47] Maximilian Balandat, et al. Differentiable Expected Hypervolume Improvement for Parallel Multi-Objective Bayesian Optimization, 2020, NeurIPS.

[48] Valentin Dalibard, et al. BOAT: Building Auto-Tuners with Structured Bayesian Optimization, 2017, WWW.

[49] K. Perez. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 2014.

[50] David Maxwell Chickering, et al. Large-Sample Learning of Bayesian Networks is NP-Hard, 2002, J. Mach. Learn. Res.

[51] Minlan Yu, et al. CherryPick: Adaptively Unearthing the Best Cloud Configurations for Big Data Analytics, 2017, NSDI.

[52] Judea Pearl, et al. The Do-Calculus Revisited, 2012, UAI.

[53] Geoffrey J. Gordon, et al. Automatic Database Management System Tuning Through Large-scale Machine Learning, 2017, SIGMOD Conference.

[54] Kuo-Chu Chang, et al. Comparison of score metrics for Bayesian network learning, 2002, IEEE Trans. Syst. Man Cybern. Part A.

[55] Valentin Dalibard, et al. A framework to build bespoke auto-tuners with structured Bayesian optimisation, 2017.

[56] Pradeep Ravikumar, et al. DAGs with NO TEARS: Continuous Optimization for Structure Learning, 2018, NeurIPS.