Techniques for improved placement-coupled logic replication

Several recent papers have used logic replication driven by placement-level timing analysis to improve clock period (e.g., [1], [8], [18], and [2]). Through various optimization strategies, all of these papers demonstrated the potential of the basic replication technique. In this paper we propose a number of techniques aimed at more fully realizing this potential within the framework employed in [8]. As reported in [7], there are situations in which the approach of [8] fails to yield significant additional improvement, due largely to the effects of reconvergence in the netlist. We suggest rectilinear Steiner arborescence embedding as a tool for overcoming this limitation. We also propose techniques for fanout partitioning and cell relocation that account for both wirelength and timing impact to improve solution quality. We further report the effect of other techniques, including a new replication cost computation, lower-bounding of the achievable clock period, and wirelength estimation. We have implemented and evaluated these techniques in the FPGA domain. In many cases we were able to approach a fixed flip-flop lower bound on the achievable clock period. The experimental results are promising: an average delay reduction of 17.4% (up to 39.9%) compared with the timing-driven placement from VPR [16], and an average reduction of 9.3% (up to 37.2%) compared with the basic fanin tree embedder from [8].
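For readers unfamiliar with the rectilinear Steiner arborescence (RSA) problem mentioned above, the following is a minimal sketch of a well-known greedy merging heuristic for it. It is not the embedding algorithm of this paper, which operates on fanin trees within a placement. The function names (`meet`, `rsa_heuristic`), the root at the origin, and the restriction of sinks to the first quadrant are assumptions made for illustration only.

```python
from itertools import combinations

ROOT = (0, 0)  # assumption: the root (source) sits at the origin


def meet(p, q):
    """Farthest point from the root that lies on shortest paths to both p and q."""
    return (min(p[0], q[0]), min(p[1], q[1]))


def rsa_heuristic(sinks):
    """Greedy arborescence for sinks with non-negative coordinates.

    Repeatedly merges the pair of nodes whose meet point is farthest
    from the root, so every root-to-sink path stays a shortest
    rectilinear path. Returns (edges, total_wirelength).
    """
    nodes = sorted(set(sinks))  # sorted for deterministic tie-breaking
    edges = []
    while len(nodes) > 1:
        # Choose the pair whose meet point has the largest x + y.
        p, q = max(combinations(nodes, 2), key=lambda pair: sum(meet(*pair)))
        m = meet(p, q)
        nodes.remove(p)
        nodes.remove(q)
        for child in (p, q):
            if child != m:  # skip zero-length edges
                edges.append((m, child))
        if m not in nodes:
            nodes.append(m)
    # Connect the last remaining node to the root if it is not the root itself.
    if nodes and nodes[0] != ROOT:
        edges.append((ROOT, nodes[0]))
    total = sum(abs(a[0] - b[0]) + abs(a[1] - b[1]) for a, b in edges)
    return edges, total


if __name__ == "__main__":
    edges, total = rsa_heuristic([(2, 1), (1, 2)])
    print(edges, total)
```

For the two sinks (2, 1) and (1, 2), the sketch shares a trunk from the root to the meet point (1, 1) and then branches, for a total wirelength of 4, whereas routing each sink separately from the root would cost 6. Every edge points away from the root in both coordinates, which is the property that makes arborescences attractive for timing: no sink's path is lengthened in order to save wire.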

[1] Stephen D. Brown, et al. Using logic duplication to improve performance in FPGAs, 2003, FPGA '03.

[2] Frank K. Hwang, et al. The rectilinear Steiner arborescence problem, 2005, Algorithmica.

[3] Chung-Kuan Cheng, et al. A replication cut for two-way partitioning, 1995, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst.

[4] Robert K. Brayton, et al. Delay-optimal technology mapping by DAG covering, 1998, Proceedings 1998 Design and Automation Conference. 35th DAC. (Cat. No.98CH36175).

[5] Sung-Woo Hur, et al. Timing driven maze routing, 1999, ISPD '99.

[6] Jason Cong, et al. Performance-Driven Interconnect Design Based on Distributed RC Delay Model, 1993, 30th ACM/IEEE Design Automation Conference.

[7] John Lillis, et al. An Approach to Placement-Coupled Logic Replication, 2006, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst.

[8] Chung-Kuan Cheng, et al. Algorithms for optimal introduction of redundant logic for timing and area optimization, 1996, 1996 IEEE International Symposium on Circuits and Systems. Circuits and Systems Connecting the World. ISCAS 96.

[9] Vaughn Betz, et al. Timing-driven placement for FPGAs, 2000, FPGA '00.

[10] Jason Cong, et al. Simultaneous timing-driven placement and duplication, 2005, FPGA '05.

[11] Robert K. Brayton, et al. Wireplanning in logic synthesis, 1998, 1998 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (IEEE Cat. No.98CB36287).

[12] A. Kahng, et al. On optimal interconnections for VLSI, 1994.

[13] John Lillis, et al. Addressing the Effects of Reconvergence on Placement-Coupled Logic Replication, 2022.

[14] J. Lillis, et al. S-Tree: a technique for buffered routing tree synthesis, 2002, Proceedings 2002 Design Automation Conference (IEEE Cat. No.02CH37324).

[15] Alberto L. Sangiovanni-Vincentelli, et al. Addressing the timing closure problem by integrating logic optimization and placement, 2001, IEEE/ACM International Conference on Computer Aided Design. ICCAD 2001. IEEE/ACM Digest of Technical Papers (Cat. No.01CH37281).

[16] Ankur Srivastava, et al. Timing driven gate duplication: complexity issues and algorithms, 2000, IEEE/ACM International Conference on Computer Aided Design. ICCAD - 2000. IEEE/ACM Digest of Technical Papers (Cat. No.00CH37140).

[17] Kurt Keutzer. DAGON: Technology Binding and Local Optimization by DAG Matching, 1987, DAC.

[18] Martin D. F. Wong, et al. Minimum replication min-cut partitioning, 1996, Proceedings of International Conference on Computer Aided Design.

[19] Ankur Srivastava, et al. Timing driven gate duplication, 2004, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[20] John Lillis, et al. Timing optimization of FPGA placements by logic replication, 2003, Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451).