Application-Specific Memory Interleaving Enables High Performance in FPGA-based Grid Computations

Current generations of FPGAs create possibilities for innovative, application-specific computation pipelines. In many cases, the pipeline can fully exploit the FPGA's parallelism only when multiple operands are available concurrently, requiring clusters of values to be fetched from memory. These clusters of values often have fixed organization, as in the eight grid points around an off-grid position that are needed for 3D interpolation of a value at that position. This paper presents a technique for creating custom interleaving of the FPGA's on-chip memories, giving access to the entire cluster of values in one memory cycle. This technique works on grids of 2, 3, or more dimensions, on many non-rectangular grids, and on cluster organization specific to each application. The authors report the initial version of a design tool that inputs the relative positions of grid points in the access cluster, and produces synthesizable HDL code for the custom-interleaved memory

[1]  Richard M. Brown,et al.  The ILLIAC IV Computer , 1968, IEEE Transactions on Computers.

[2]  Maya Gokhale,et al.  Automatic allocation of arrays to memories in FPGA processors with multiple memory banks , 1999, Seventh Annual IEEE Symposium on Field-Programmable Custom Computing Machines (Cat. No.PR00375).