Optimizing streaming stencil time-step designs via FPGA floorplanning

Stencil computations are a recurrent class of algorithms in many high-performance computing scenarios. The Streaming Stencil Time-step (SST) architecture is a recent implementation of stencil computations on Field Programmable Gate Arrays (FPGAs). In this paper, we propose an automated framework for SST-based architectures that targets the maximum performance achievable on a given FPGA device through 1) maximizing the number of basic modules instantiated in the design and 2) optimizing the design floorplan. Experimental results show that the proposed approach reduces design time by up to 15× with respect to naive design space exploration approaches, and improves performance by 13%.
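For readers unfamiliar with the computation pattern, the following is a minimal software sketch of an iterative stencil, assuming a 1D grid, a 3-point averaging kernel, and fixed boundary cells. The names `stencil_step` and `run_time_steps` are hypothetical and chosen only for illustration; the code shows the repeated time-step structure in general and is not the SST hardware architecture or the proposed framework.

```python
def stencil_step(grid):
    """Apply one time-step of a 3-point averaging stencil.

    Assumption: boundary cells (first and last) are left unchanged.
    """
    out = list(grid)
    for i in range(1, len(grid) - 1):
        # Each interior cell is updated from its fixed neighborhood.
        out[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return out


def run_time_steps(grid, steps):
    """Chain `steps` time-steps, feeding each output into the next step."""
    for _ in range(steps):
        grid = stencil_step(grid)
    return grid


if __name__ == "__main__":
    initial = [0.0, 0.0, 3.0, 0.0, 0.0]
    print(run_time_steps(initial, 2))
```

Each call to `stencil_step` consumes the output of the previous one, which is the dependency that makes a chain of time-steps a natural fit for a streaming pipeline.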
