Behavior-Level Observability Analysis for Operation Gating in Low-Power Behavioral Synthesis

Many power-reduction techniques in advanced RTL synthesis tools rely, explicitly or implicitly, on observability don’t-care conditions. In this article we propose a systematic approach to maximize the effectiveness of these techniques by generating power-friendly RTL descriptions in behavioral synthesis. The approach uses operation gating: a predicate derived from an operation's observability condition is explicitly attached to the operation, so that once the operation is identified as unobservable at runtime, it can be avoided by RTL power optimization techniques such as clock gating. We first introduce the concept of behavior-level observability and its approximations in the context of behavioral synthesis. We then propose an efficient procedure to compute an approximate behavior-level observability condition for every operation in a dataflow graph. Unlike previous techniques, which work at the bit level in Boolean networks, our method performs the analysis at the word level and thus avoids most of the computational effort at the cost of a reasonable approximation. Our algorithm exploits the observability-masking nature of some Boolean operations, as well as of the select operation, and allows certain forms of additional knowledge to be taken into account to obtain stronger observability conditions. The approximation is proved exact for (acyclic) dataflow graphs when non-Boolean operations other than select are treated as black boxes. The behavior-level observability conditions obtained by our analysis can be used to guide the operation scheduler to optimize the efficiency of operation gating. In a set of experiments on real-world designs, our method achieves an average of 33.9% reduction in total power; it outperforms a previous method by 17.1% on average and gives close-to-optimal solutions on several designs. To the best of our knowledge, this is the first time behavior-level observability analysis and optimization have been performed during behavioral synthesis in a systematic manner. We believe that our idea can be applied to compiler transformations in general.
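
To make the word-level masking idea concrete, the following is a minimal Python sketch, not the algorithm of the article. It evaluates a small acyclic dataflow graph for one input vector and then walks it backwards to mark which operations are observable for that vector. The `Op` class, the `GRAPH` example, and the function names are hypothetical; the masking rules assumed here are that a select hides its unchosen operand, an AND hides one operand when the other is false, an OR hides one operand when the other is true, and every other operation is a black box that passes observability to all of its operands. Each operation is judged in isolation, so the sketch ignores the interaction that arises when several operations are gated at once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str          # "input", "add", "mul", "and", "or", or "select"
    args: tuple = ()   # names of the operand nodes


# Hypothetical example graph, listed in topological order:
#   out = sel ? (a * b) : (a + c)
GRAPH = {
    "a": Op("input"), "b": Op("input"), "c": Op("input"), "sel": Op("input"),
    "m": Op("mul", ("a", "b")),
    "s": Op("add", ("a", "c")),
    "out": Op("select", ("sel", "m", "s")),
}


def evaluate(graph, inputs):
    """Compute the value of every node for one input vector."""
    val = dict(inputs)
    for name, op in graph.items():
        if op.kind == "input":
            continue
        a = [val[x] for x in op.args]
        if op.kind == "add":
            val[name] = a[0] + a[1]
        elif op.kind == "mul":
            val[name] = a[0] * a[1]
        elif op.kind == "and":
            val[name] = bool(a[0]) and bool(a[1])
        elif op.kind == "or":
            val[name] = bool(a[0]) or bool(a[1])
        elif op.kind == "select":
            val[name] = a[1] if a[0] else a[2]
        else:
            raise ValueError(f"unknown operation kind: {op.kind}")
    return val


def observability(graph, outputs, val):
    """Mark each node as observable or not for the values in `val`."""
    obs = {name: name in outputs for name in graph}
    # Reverse topological order: all users of a node are visited before it.
    for name in reversed(list(graph)):
        op = graph[name]
        if not obs[name] or op.kind == "input":
            continue
        if op.kind == "select":
            cond, if_true, if_false = op.args
            obs[cond] = True
            obs[if_true] |= bool(val[cond])       # visible only if chosen
            obs[if_false] |= not val[cond]
        elif op.kind in ("and", "or"):
            x, y = op.args
            passes = bool if op.kind == "and" else (lambda v: not v)
            obs[x] |= passes(val[y])              # y can mask x
            obs[y] |= passes(val[x])              # x can mask y
        else:
            for x in op.args:                     # black box: no masking
                obs[x] = True
    return obs
```

With `sel` set to 1 the multiplication `m` is observable and the addition `s` is not, so `s` would be a candidate for gating in that cycle; with `sel` set to 0 the roles swap. An actual flow would derive these conditions symbolically as predicates rather than per input vector.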
