LinkBlaze: Efficient global data movement for FPGAs

FPGA capacity has grown rapidly, and emerging applications comprise a large number of compute modules. Communication among these modules and with external memory causes routing congestion in the fabric interconnect. The problem becomes more pronounced with process scaling, because newer technology nodes do not improve wire resistance. High-speed compute modules, together with faster local storage, enable efficient kernels that run at 500–800 MHz on newer devices. Sustaining FPGA system performance at those frequencies requires efficient global data movement across the chip, to and from external memory. To address these issues, we propose LinkBlaze, a system-level solution for global data movement in FPGAs. LinkBlaze leverages both general Network-on-Chip (NoC) techniques and FPGA architecture features to reduce resource usage. This work explores different router architectures and provides insights for the user on how best to utilize and share the global links on the FPGA. Our results indicate 640 MHz performance on UltraScale+ for an optimized 3-port soft NoC. We extend those results to implement a global data movement overlay operating at up to 1 GHz by restricting the number of clients and leveraging flexible FPGA placement. Our proposed solution enables 8 GB/s system-level throughput on UltraScale+ for a 64-bit instance, while using otherwise underutilized FPGA resources. Our results also indicate how to scale the solution to provide more than 52 GB/s of access to external memory using 2–3× fewer fabric resources. We recommend an architecture and a preferred location for the proposed interconnect overlay, and we intend to make the solution available to the FPGA application community.
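The throughput figures above follow from simple width-times-frequency arithmetic. The sketch below is only a sanity check under the assumption that a link transfers one full word per clock cycle with no protocol overhead; the function names are hypothetical, and the link count for 52 GB/s assumes scaling by replicating 64-bit links at 1 GHz, which the abstract does not state.

```python
import math


def peak_bandwidth_gbps(width_bits: int, freq_mhz: float) -> float:
    """Peak link bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumption: one full word is transferred every clock cycle.
    """
    bytes_per_cycle = width_bits / 8
    return bytes_per_cycle * freq_mhz / 1000


def links_needed(target_gbps: float, width_bits: int, freq_mhz: float) -> int:
    """Number of identical links needed to reach a target peak bandwidth.

    Assumption: bandwidth scales linearly with the number of links.
    """
    return math.ceil(target_gbps / peak_bandwidth_gbps(width_bits, freq_mhz))


if __name__ == "__main__":
    # 64-bit instance at 1 GHz: matches the 8 GB/s quoted in the abstract.
    print(peak_bandwidth_gbps(64, 1000))
    # The same width at the 640 MHz soft-NoC frequency.
    print(peak_bandwidth_gbps(64, 640))
    # Hypothetical scaling: 64-bit, 1 GHz links needed to exceed 52 GB/s.
    print(links_needed(52, 64, 1000))
```

Under these assumptions, a 64-bit link at 1 GHz peaks at 8 GB/s, and the same link at 640 MHz peaks at roughly 5.1 GB/s. Sustained throughput on a shared interconnect would be lower once arbitration and contention are taken into account.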
