An efficient clustering algorithm for low power clock tree synthesis

Clocks are known to be a major source of power consumption in digital circuits, especially in high-performance microprocessors. With technology scaling, increasingly capacitive interconnects contribute more than 40% of the local clock power. In this paper, we propose a clustering algorithm for minimizing the power in the local clock tree, which is shown to be equivalent to minimizing the interconnect capacitance in the tree. Given a set of sequentials and their locations, clustering is performed to determine the clock buffers that are required to synchronize the sequentials, where a cluster implies that a single clock buffer drives all the sequentials in the cluster. The clustering algorithm uses a minimum spanning tree (MST) metric to estimate the interconnect capacitance and guarantees an optimal solution when no capacity constraints are applied. The buffers are then sized and the clock nets are routed to meet the delay, slope, and skew constraints. We compare the clock trees obtained by our clustering and by competing approaches on several blocks from a microprocessor design in 65nm technology. The comparison shows that our algorithm consistently improves the clock tree capacitance, by up to 21%.
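As an illustration of the MST metric mentioned in the abstract, the following is a minimal sketch of how the interconnect capacitance of one cluster might be estimated from the locations of its sequentials. It is not the paper's algorithm: the use of Manhattan distance, the function names, and the `cap_per_unit` parameter (wire capacitance per unit length) are assumptions made for this example, and buffer and pin capacitances are ignored.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def manhattan(a: Point, b: Point) -> float:
    """Manhattan distance between two placed sequentials (assumed wire metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mst_length(points: List[Point]) -> float:
    """Total edge length of a minimum spanning tree over the points.

    Uses Prim's algorithm in O(n^2), which is adequate for small clusters.
    """
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known connection of each point to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        # Relax the connection cost of the remaining points through u.
        for v in range(n):
            if not in_tree[v]:
                d = manhattan(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def cluster_wire_cap(points: List[Point], cap_per_unit: float = 1.0) -> float:
    """Hypothetical capacitance estimate: MST length times capacitance per unit length."""
    return cap_per_unit * mst_length(points)


if __name__ == "__main__":
    one_cluster = [(0, 0), (1, 0), (1, 1), (5, 5)]
    print(cluster_wire_cap(one_cluster))
    # Splitting off the distant sequential lowers the wire estimate,
    # at the cost of one more clock buffer.
    print(cluster_wire_cap(one_cluster[:3]) + cluster_wire_cap(one_cluster[3:]))
```

Under these assumptions, comparing the summed estimate across candidate partitions gives a rough view of the trade-off that clustering has to resolve between wire capacitance and the number of clock buffers.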
