A transport-layer network for distributed FPGA platforms

We present a transport-layer network that helps developers build safe, high-performance distributed FPGA applications. Two essential features of such a network are virtual channels and end-to-end flow control. Our network implements both, exploiting the low error rate of a rack-level FPGA network to provide low-overhead, credit-based end-to-end flow control. The design exposes many source-level parameters that are set at FPGA synthesis time, so that buffer sizes and flow-control credits can be chosen per virtual channel to match its traffic pattern and make the best use of scarce on-chip memory. Our prototype cluster, composed of 20 Xilinx VC707 boards, each with four 20 Gb/s serial links, achieves an effective bandwidth of 85% of the maximum physical bandwidth and a latency of 0.5 µs per hop. User feedback suggests that these features make distributed application development significantly easier.
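The credit-based end-to-end flow control described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design: the class names `Sender` and `Receiver`, the function `run_demo`, and the one-credit-per-packet accounting are hypothetical choices made for illustration. The sender starts with as many credits as the receiver has buffer slots, spends one credit per packet, and regains a credit only when the receiving application drains a packet, so the receive buffer cannot overflow and no packet needs to be dropped or retransmitted.

```python
from collections import deque


class Receiver:
    """Receive side of one virtual channel with a fixed-size buffer (hypothetical model)."""

    def __init__(self, buffer_slots):
        self.buffer_slots = buffer_slots
        self.buffer = deque()

    def accept(self, packet):
        # With correct credit accounting this assertion never fires.
        assert len(self.buffer) < self.buffer_slots, "receive buffer overflow"
        self.buffer.append(packet)

    def consume(self):
        """Hand one packet to the application; return (packet, credits_freed)."""
        if not self.buffer:
            return None, 0
        return self.buffer.popleft(), 1


class Sender:
    """Send side: may transmit only while it holds at least one credit."""

    def __init__(self, initial_credits):
        self.credits = initial_credits

    def try_send(self, packet, receiver):
        if self.credits == 0:
            return False  # stall: no guaranteed buffer space at the destination
        self.credits -= 1
        receiver.accept(packet)
        return True

    def add_credits(self, n):
        self.credits += n


def run_demo(buffer_slots=4, num_packets=10):
    """Send num_packets through one channel; return (delivered, stall_count)."""
    receiver = Receiver(buffer_slots)
    sender = Sender(initial_credits=buffer_slots)
    delivered, stalls, next_packet = [], 0, 0
    while len(delivered) < num_packets:
        # Send as many packets as the current credits allow.
        while next_packet < num_packets and sender.try_send(next_packet, receiver):
            next_packet += 1
        if next_packet < num_packets:
            stalls += 1  # sender is blocked waiting for credits
        # The application drains one packet, which frees one credit.
        packet, freed = receiver.consume()
        if packet is not None:
            delivered.append(packet)
        sender.add_credits(freed)
    return delivered, stalls


if __name__ == "__main__":
    print(run_demo())
```

In this model, a smaller `buffer_slots` value saves memory but makes the sender stall more often, which mirrors the trade-off between on-chip buffer size and flow-control credits that the synthesis-time parameters are meant to tune.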
