Theta-Resonance: A Single-Step Reinforcement Learning Method for Design Space Exploration

Given an environment (e.g., a simulator) for evaluating samples in a specified design space, together with a set of weighted evaluation metrics, one can use θ-Resonance, a single-step Markov Decision Process (MDP), to train an intelligent agent that produces progressively better samples. In θ-Resonance, a neural network Net_θ consumes a constant input tensor and produces a policy π_θ as a set of conditional probability density functions (PDFs), one for sampling each design dimension. We specialize existing policy gradient algorithms from deep reinforcement learning (D-RL) to use evaluation feedback (expressed as cost, penalty, or reward) to update Net_θ with robust algorithmic stability and a minimal number of design evaluations. We study multiple neural architectures for Net_θ in the context of a simple SoC design space, and we propose a method of constructing synthetic space exploration problems for comparing and improving design space exploration (DSE) algorithms. Although we present only categorical design spaces, we also outline how to use θ-Resonance to explore continuous and mixed continuous-discrete design spaces.
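The single-step loop described above can be illustrated with a minimal sketch. It is not the authors' implementation and makes several simplifying assumptions: the policy is one linear head per design dimension fed by a constant input (so the dimensions are sampled independently rather than conditionally), the update is plain REINFORCE with a batch-mean baseline rather than the specialized policy gradient algorithms the abstract refers to, and `toy_cost`, `train`, and all hyperparameters are hypothetical stand-ins for a real simulator and training setup.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def toy_cost(design, target):
    """Hypothetical stand-in for a simulator: count mismatched dimensions."""
    return float(np.sum(design != target))

def train(sizes, target, steps=300, batch=32, lr=0.5, seed=0):
    """Single-step policy gradient over a categorical design space.

    sizes  -- number of choices in each design dimension
    target -- the (hidden) best design used by toy_cost
    Returns the learned categorical distribution for each dimension.
    """
    rng = np.random.default_rng(seed)
    x = np.ones(1)  # constant input: there is no state to observe
    # Assumption: one linear head per dimension, logits = W @ x.
    weights = [np.zeros((k, 1)) for k in sizes]
    for _ in range(steps):
        probs = [softmax(W @ x) for W in weights]
        # Sample a batch of designs, one categorical draw per dimension.
        designs = np.stack(
            [rng.choice(len(p), size=batch, p=p) for p in probs], axis=1
        )
        costs = np.array([toy_cost(d, target) for d in designs])
        # Reward is negative cost; the batch mean serves as the baseline.
        adv = -(costs - costs.mean())
        for d, (W, p) in enumerate(zip(weights, probs)):
            onehot = np.eye(len(p))[designs[:, d]]  # shape (batch, k)
            # Gradient of log-softmax w.r.t. logits is (onehot - p).
            grad_logits = (adv[:, None] * (onehot - p)).mean(axis=0)
            W += lr * np.outer(grad_logits, x)  # gradient ascent on reward
    return [softmax(W @ x) for W in weights]
```

Calling `train([4, 3, 5], np.array([2, 0, 3]))` returns three probability vectors; in this toy setting the probability mass is expected to concentrate on the target choice in each dimension as training proceeds. Each episode consists of exactly one action (a full design) and one evaluation, which is what makes the MDP single-step.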
