A Fast and Near-Optimal Clustering Algorithm for Low-Power Clock Tree Synthesis

Clocks are known to be a major source of power consumption in digital circuits. In this paper, we propose a clustering algorithm for minimizing power in a local clock tree. Given a set of sequentials and their locations, clustering determines the clock buffers required to synchronize the sequentials, where each cluster corresponds to one clock buffer that drives all the sequentials in that cluster. The results produced by the algorithm are often within 1.3× of the lower bound and, on average, have 32% lower costs than those of an approximation algorithm, with 2.5× faster runtimes. Compared to a competitive heuristic from a vendor tool, the algorithm's results on several blocks of microprocessor designs in advanced nanometer technologies show a 14% reduction, on average, in clock tree power while meeting skew or slew constraints. The algorithm has been employed for clock tree synthesis in several microprocessor designs across process generations because of its consistently significant clock tree power savings over competitive alternatives.
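To make the problem statement concrete, the following is a minimal sketch of the clustering formulation only, not the algorithm proposed in this paper. It assumes a simple greedy assignment, a Manhattan-distance radius limit as a crude stand-in for skew or slew constraints, and a hypothetical cost model (a fixed cost per buffer plus wire length from each sequential to its cluster center). The names `greedy_cluster`, `clustering_cost`, `max_fanout`, `max_radius`, `buffer_cost`, and `wire_cost_per_unit` are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One cluster: a single clock buffer drives every member sequential."""
    members: list = field(default_factory=list)  # (x, y) locations

    def center(self):
        # Assumed buffer location: centroid of the member sequentials.
        n = len(self.members)
        return (sum(x for x, _ in self.members) / n,
                sum(y for _, y in self.members) / n)


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def greedy_cluster(sequentials, max_fanout, max_radius):
    """Naive greedy clustering (illustrative only, not the paper's method).

    Each sequential joins the nearest cluster that still has capacity and
    whose current center lies within max_radius; otherwise it opens a new
    cluster, i.e. a new clock buffer.
    """
    clusters = []
    for s in sorted(sequentials):
        best = None
        for c in clusters:
            if len(c.members) >= max_fanout:
                continue
            d = manhattan(s, c.center())
            if d <= max_radius and (best is None or d < best[0]):
                best = (d, c)
        if best is None:
            clusters.append(Cluster([s]))
        else:
            best[1].members.append(s)
    return clusters


def clustering_cost(clusters, buffer_cost=10.0, wire_cost_per_unit=1.0):
    """Hypothetical power proxy: buffer cost plus center-to-sequential wire."""
    total = 0.0
    for c in clusters:
        ctr = c.center()
        wire = sum(manhattan(m, ctr) for m in c.members)
        total += buffer_cost + wire_cost_per_unit * wire
    return total


if __name__ == "__main__":
    points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
    result = greedy_cluster(points, max_fanout=3, max_radius=3)
    print(len(result), clustering_cost(result))
```

In this sketch, a lower `clustering_cost` corresponds to fewer buffers and shorter local wiring, which is the kind of trade-off a power-driven clustering has to balance against the constraints.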
