GRANNITE: Graph Neural Network Inference for Transferable Power Estimation

This paper introduces GRANNITE, a novel GPU-accelerated graph neural network (GNN) model for fast, accurate, and transferable vector-based average power estimation. During training, GRANNITE learns how to propagate average toggle rates through combinational logic: a netlist is represented as a graph, register states and unit inputs from RTL simulation are used as features, and combinational gate toggle rates are used as labels. A trained GNN model can then infer average toggle rates for a new workload of interest, or for new netlists, from RTL simulation results in a few seconds. Compared to traditional power analysis using gate-level simulations, GRANNITE achieves a speedup of more than 18.7X with an error of less than 5.5% across a diverse set of benchmark circuits. Compared to a GPU-accelerated conventional probabilistic switching activity estimation approach, GRANNITE achieves much better accuracy (on average 25.9% lower error) at similar runtimes.
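To make the training setup described above concrete, the following is a minimal, hypothetical sketch of a per-gate toggle-rate regressor built with PyTorch and the Deep Graph Library (DGL). The two-layer graph convolution, the toy netlist, the feature layout, and all numeric values are illustrative assumptions only; they are not GRANNITE's actual architecture, feature set, or data.

```python
# Illustrative sketch only: a generic DGL/PyTorch graph-convolution regressor for
# per-gate toggle rates. The toy netlist, features, and labels below are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F
import dgl
from dgl.nn import GraphConv

class ToggleRateGNN(nn.Module):
    """Two-layer graph convolution that regresses one toggle rate per node."""
    def __init__(self, in_feats, hidden_feats):
        super().__init__()
        self.conv1 = GraphConv(in_feats, hidden_feats)
        self.conv2 = GraphConv(hidden_feats, hidden_feats)
        self.readout = nn.Linear(hidden_feats, 1)

    def forward(self, g, x):
        h = F.relu(self.conv1(g, x))
        h = F.relu(self.conv2(g, h))
        return torch.sigmoid(self.readout(h)).squeeze(-1)  # toggle rate in [0, 1]

# Toy netlist graph: edges point from driving gate to driven gate (a DAG).
src = torch.tensor([0, 1, 2, 2])
dst = torch.tensor([2, 2, 3, 4])
g = dgl.add_self_loop(dgl.graph((src, dst), num_nodes=5))

# Hypothetical node features: [toggle rate from RTL sim, signal probability, is_register_output].
feat = torch.tensor([[0.20, 0.5, 1.0],
                     [0.10, 0.5, 1.0],
                     [0.00, 0.5, 0.0],
                     [0.00, 0.5, 0.0],
                     [0.00, 0.5, 0.0]])
# Labels: per-gate toggle rates from gate-level simulation (made-up values here).
label = torch.tensor([0.20, 0.10, 0.15, 0.15, 0.15])

model = ToggleRateGNN(in_feats=3, hidden_feats=16)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):                      # supervised regression on toggle rates
    pred = model(g, feat)
    loss = F.mse_loss(pred, label)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Inference: reuse the trained model on graphs built from new workloads or netlists.
with torch.no_grad():
    print(model(g, feat))
```

In this toy setup the trained model is simply re-applied to graphs built from unseen netlists or workloads with fresh RTL-simulation features, which is the sense in which such a regressor is transferable.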
