Autonomous Application of Netlist Transformations Inside Lagrangian Relaxation-Based Optimization

Timing closure is a complex process that involves many iterative optimization steps applied in various phases of the physical design flow. Lagrangian Relaxation (LR)-based optimization has been established as a viable approach for this task. We extend LR-based optimization by interleaving, within each iteration, various techniques such as gate and flip-flop sizing; buffering to fix late and early timing violations; pin swapping; gate merge/split transformations; and useful clock skew. In all cases, locally optimal decisions are made using LR-based cost functions. In each iteration of LR-based optimization, we use the Multi-Armed Bandit (MAB) model to automatically select which optimization heuristic to apply to the design. The selection aims to improve the performance metrics, based on the rewards learned from previous applications of each heuristic and on the runtime cost paid for each reward. This fine-grained combination of an LR-based optimization flow with a statistical recommendation system allows the optimization flow to execute autonomously and yields a significant quality-of-results improvement relative to the state of the art. Specifically, on the TAU2019 benchmarks our flow achieves, on average, a 17% lower clock period while also saving 15% power and 6% area compared to the TAU2019 contest winner; on the ISPD13 benchmarks it achieves 25% better leakage power compared to the best reported results.
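The heuristic-selection step described above can be illustrated with a minimal sketch. The abstract does not state which bandit policy is used, so the code below assumes a UCB1-style rule and defines the reward as the observed gain divided by the runtime spent. The class `HeuristicBandit`, the function `run_demo`, the arm names, and the gain/runtime figures are all hypothetical and serve only as an example; they are not the authors' implementation.

```python
import math
import random


class HeuristicBandit:
    """UCB1-style selector over optimization heuristics (illustrative only)."""

    def __init__(self, arms, exploration=1.0):
        self.arms = list(arms)
        self.exploration = exploration          # assumed tuning knob
        self.pulls = {a: 0 for a in self.arms}
        self.mean_reward = {a: 0.0 for a in self.arms}
        self.total_pulls = 0

    def select(self):
        # Apply every heuristic once before trusting the statistics.
        for arm in self.arms:
            if self.pulls[arm] == 0:
                return arm

        def score(arm):
            # Mean reward plus an exploration bonus that shrinks with use.
            bonus = self.exploration * math.sqrt(
                2.0 * math.log(self.total_pulls) / self.pulls[arm]
            )
            return self.mean_reward[arm] + bonus

        return max(self.arms, key=score)

    def update(self, arm, gain, runtime):
        # Assumed reward definition: improvement per unit of runtime.
        reward = gain / max(runtime, 1e-9)
        self.pulls[arm] += 1
        self.total_pulls += 1
        # Incremental update of the running mean reward.
        self.mean_reward[arm] += (reward - self.mean_reward[arm]) / self.pulls[arm]


def run_demo(iterations=200, seed=0):
    """Simulate LR iterations with made-up (mean gain, runtime) per heuristic."""
    rng = random.Random(seed)
    profile = {
        "gate_sizing": (0.6, 1.0),
        "buffering": (0.5, 2.0),
        "pin_swapping": (0.1, 0.5),
        "useful_skew": (0.3, 1.5),
    }
    bandit = HeuristicBandit(profile)
    for _ in range(iterations):
        arm = bandit.select()
        mean_gain, runtime = profile[arm]
        # Stand-in for applying the heuristic and measuring the improvement.
        gain = max(0.0, rng.gauss(mean_gain, 0.05))
        bandit.update(arm, gain, runtime)
    return bandit


if __name__ == "__main__":
    result = run_demo()
    for name in result.arms:
        print(name, result.pulls[name], round(result.mean_reward[name], 3))
```

In this toy setup the arm with the best gain-per-runtime ratio ends up being chosen most often, while the others are still revisited occasionally. In a real flow, the simulated gain would be replaced by the measured change in the design metrics after applying the chosen transformation.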
