The effect of LUT and cluster size on deep-submicron FPGA performance and density

In this paper, we revisit the field-programmable gate array (FPGA) architectural question of how logic block functionality affects FPGA performance and density. In particular, in the context of lookup-table-based, cluster-based, island-style FPGAs (Betz et al. 1997), we look at the effect of lookup table (LUT) size and cluster size (number of LUTs per cluster) on the speed and logic density of an FPGA. We use a fully timing-driven experimental flow (Betz et al. 1997; Marquardt, 1999) in which a set of benchmark circuits is synthesized into different cluster-based (Betz and Rose, 1997, 1998; Marquardt, 1999) logic block architectures, which contain groups of LUTs and flip-flops. First, across all architectures with LUT sizes in the range of 2 to 7 inputs and cluster sizes from 1 to 10 LUTs, we have experimentally determined the number of inputs required for a cluster as a function of the LUT size (K) and cluster size (N). Second, contrary to previous results, we have shown that clustering small LUTs (sizes 2 and 3) produces better area results than what was presented in the past. However, our results also show that the performance of FPGAs with these small LUT sizes is significantly worse (by almost a factor of 2) than that of FPGAs with larger LUTs. Hence, as measured by area-delay product or by performance, these small LUT sizes would be a bad choice. Third, we have discovered that LUT sizes of 5 and 6 produce much better area results than were previously believed. Finally, our results show that a LUT size of 4 to 6 and a cluster size of between 3 and 10 provide the best area-delay product for an FPGA.
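The area-delay product used above as the figure of merit can be illustrated with a minimal sketch. It assumes each architecture is identified by a (K, N) pair and comes with one area value and one delay value in consistent units. The function names and the sample numbers are hypothetical placeholders chosen for illustration; they are not measurements from this paper.

```python
from typing import Dict, Tuple

Arch = Tuple[int, int]            # (K, N): LUT size, LUTs per cluster
Metrics = Tuple[float, float]     # (area, delay) in any consistent units


def area_delay_product(area: float, delay: float) -> float:
    """Figure of merit: smaller is better."""
    return area * delay


def best_architecture(measurements: Dict[Arch, Metrics]) -> Arch:
    """Return the (K, N) pair with the smallest area-delay product."""
    if not measurements:
        raise ValueError("no architectures to compare")
    return min(
        measurements,
        key=lambda arch: area_delay_product(*measurements[arch]),
    )


# Arbitrary placeholder values (NOT results from the paper), normalized
# so that one architecture has area 1.0 and another has delay 1.0.
sample = {
    (2, 4): (1.0, 2.0),
    (4, 4): (1.1, 1.0),
    (7, 4): (1.6, 0.95),
}
print(best_architecture(sample))
```

Because the metric multiplies area by delay, an architecture with the smallest area can still lose if its delay is much worse, which is the trade-off the abstract describes for small LUT sizes.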

[1]  Jonathan Rose,et al.  CALL FOR ARTICLES IEEE Design & Test of Computers Special Issue on Microprocessors , 1996 .

[2]  André DeHon,et al.  Hardware-assisted simulated annealing with application for fast FPGA placement , 2003, FPGA '03.

[3]  A. El Gamal,et al.  PLA-based FPGA area versus cell granularity , 1992, 1992 Proceedings of the IEEE Custom Integrated Circuits Conference.

[4]  Vaughn Betz,et al.  Using cluster-based logic blocks and timing-driven packing to improve FPGA speed and density , 1999, FPGA '99.

[5]  Jonathan Rose,et al.  The effect of logic block complexity on area of programmable gate arrays , 1989, 1989 Proceedings of the IEEE Custom Integrated Circuits Conference.

[6]  Pierre Marchal,et al.  Field-programmable gate arrays , 1999, CACM.

[7]  Jonathan Rose,et al.  The effect of LUT and cluster size on deep-submicron FPGA performance and density , 2004 .

[8]  Kevin Charles Kenton Chung.  Architecture and Synthesis of Field-Programmable Gate Arrays with Hard-wired Connections , 1994 .

[9]  Vaughn Betz,et al.  How Much Logic Should Go in an FPGA Logic Block? , 1998, IEEE Des. Test Comput..

[10]  Amit Singh,et al.  Efficient circuit clustering for area and power reduction in FPGAs , 2002 .

[11]  Steven J. E. Wilton,et al.  On the sensitivity of FPGA architectural conclusions to experimental assumptions, tools, and techniques , 2002, FPGA '02.

[12]  Sinan Kaptanoglu,et al.  A new high density and very low cost reprogrammable FPGA architecture , 1999, FPGA '99.

[13]  A. El Gamal,et al.  FPGA performance versus cell granularity , 1991, Proceedings of the IEEE 1991 Custom Integrated Circuits Conference.

[14]  Kamran Eshraghian,et al.  Principles of CMOS VLSI Design: A Systems Perspective , 1985 .

[15]  Stephen D. Brown,et al.  The effect of switch box flexibility on routability of field programmable gate arrays , 1990, IEEE Proceedings of the Custom Integrated Circuits Conference.

[16]  Jason Cong,et al.  Architecture evaluation for power-efficient FPGAs , 2003, FPGA '03.

[17]  Jonathan Rose,et al.  Architecture of field-programmable gate arrays: the effect of logic block functionality on area efficiency , 1990 .

[18]  Alexander R. Marquardt,et al.  Cluster-Based Architecture, Timing-Driven Packing and Timing-Driven Placement for FPGAs , 1999 .

[19]  Vaughn Betz,et al.  Cluster-based logic blocks for FPGAs: area-efficiency vs. input sharing and size , 1997, Proceedings of CICC 97 - Custom Integrated Circuits Conference.

[20]  Jonathan Rose,et al.  The effect of logic block architecture on FPGA performance , 1992 .

[21]  Jason Cong,et al.  FlowMap: an optimal technology mapping algorithm for delay optimization in lookup-table based FPGA designs , 1994, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..

[22]  Kenneth C. Smith,et al.  Microelectronic circuits, 2nd ed. , 1987 .

[23]  Vaughn Betz,et al.  The Stratix routing and logic architecture , 2003, FPGA '03.

[24]  Elias Ahmed,et al.  The effect of logic block granularity on deep-submicron FPGA performance and density , 2001 .

[25]  David Lewis,et al.  Using Sparse Crossbars within LUT Clusters , 2001 .

[26]  Stephen Brown,et al.  Minimizing FPGA Interconnect Delays , 1996, IEEE Des. Test Comput..

[27]  Vaughn Betz,et al.  Directional bias and non-uniformity in FPGA global routing architectures , 1996, ICCAD 1996.

[28]  Jason Cong,et al.  Boolean matching for complex PLBs in LUT-based FPGAs with application to architecture evaluation , 1998, FPGA '98.

[29]  Stephen D. Brown,et al.  Flexibility of interconnection structures for field-programmable gate arrays , 1991 .

[30]  Bai Nguyen,et al.  An Innovative, Segmented High Performance FPGA Family with Variable-Grain-Architecture and Wide-Gating Functions , 1999, FPGA.

[31]  Vaughn Betz,et al.  Architecture and CAD for Deep-Submicron FPGAs , 1999, The Springer International Series in Engineering and Computer Science.