Coarse-grained carry architecture for FPGA (poster abstract)

The fine grain size of current FPGAs has been a major performance bottleneck. In this paper, we introduce a coarse-grained carry architecture that increases the grain size from a two-bit addition/subtraction per logic block to an m-bit addition/subtraction. The m-bit addition is implemented by increasing the number of read ports of a look-up table from 1 to m-2. In addition, dedicated selection logic implements an m-bit conditional addition/subtraction. The proposed architecture improves the performance of applications with intensive arithmetic operations. We use throughput density as a cost-performance metric to justify the benefit of the new architecture and to find the optimal grain size. For selected applications, we achieve up to roughly 5 times higher throughput density at the cost of a 5-10% area penalty.
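
As an illustration only, the behavior of an m-bit conditional addition/subtraction can be sketched in software. This is a functional model, not the proposed circuit: the function names `add_sub` and `conditional_add_sub` are hypothetical, and the sketch assumes that the selection logic chooses between the arithmetic result and the unchanged first operand, which the abstract does not spell out.

```python
def add_sub(a, b, subtract, m):
    """Behavioral model of an m-bit add/subtract (two's complement).

    Subtraction is computed as a + ~b + 1. Returns (result, carry_out).
    """
    mask = (1 << m) - 1
    operand = (~b & mask) if subtract else (b & mask)
    carry_in = 1 if subtract else 0
    total = (a & mask) + operand + carry_in
    return total & mask, total >> m


def conditional_add_sub(a, b, subtract, condition, m):
    """Assumed selection step: keep the add/subtract result when
    `condition` is true, otherwise pass `a` through unchanged."""
    mask = (1 << m) - 1
    result, _carry = add_sub(a, b, subtract, m)
    return result if condition else (a & mask)


if __name__ == "__main__":
    m = 4  # illustrative grain size
    print(add_sub(5, 3, False, m))                   # plain addition
    print(add_sub(5, 3, True, m))                    # plain subtraction
    print(conditional_add_sub(5, 3, True, True, m))  # condition taken
    print(conditional_add_sub(5, 3, True, False, m)) # operand passed through
```

Operations of this kind appear, for example, in shift-and-add multiplication and restoring division, where each step either applies the addition/subtraction or keeps the previous partial result.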