System-level FPGA device driver with high-level synthesis support

We exploit the standardized communication abstractions provided by modern high-level synthesis tools such as Vivado HLS, Bluespec, and SCORE to provide stable system interfaces between the host and PCIe-based FPGA accelerator platforms. At a high level, our FPGA driver aims to give FPGA programmers CUDA-like driver behavior, and more. On the FPGA fabric, we develop an AXI-compliant, lightweight interface switch coupled to multiple physical interfaces (PCIe, Ethernet, DRAM) to provide programmable, portable routing between the host and user logic on the FPGA. On the host, we adapt the RIFFA 1.0 driver to provide enhanced communication APIs along with bitstream configuration capability, allowing low-latency, high-throughput communication and safe, reliable programming of user logic on the FPGA. Our driver consumes only 21% of the BRAMs and 14% of the logic on a Xilinx ML605 platform, or 9% of the BRAMs and 8% of the logic on a Xilinx V707 board. We sustain DMA transfer throughput (to DRAM) of 1.47 GB/s (74% of peak) over PCIe (x4 Gen2), 120.2 MB/s (96% of peak) over Ethernet (1G), and 5.93 GB/s (92.5% of peak) to DRAM.
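
The percentages quoted above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the nominal peak bandwidths are assumptions, not figures from the paper (PCIe Gen2 at 500 MB/s per lane after 8b/10b encoding, 1G Ethernet at its 125 MB/s line rate, and a 64-bit DRAM interface at 800 MT/s), and the names `NOMINAL_PEAK_MBPS`, `MEASURED_MBPS`, `efficiency`, and `report` are hypothetical.

```python
# Nominal peak bandwidths in MB/s (1 GB = 1000 MB).
# ASSUMPTION: these peaks are illustrative values, not taken from the paper.
NOMINAL_PEAK_MBPS = {
    "pcie_gen2_x4": 4 * 500.0,  # 4 lanes x 500 MB/s per Gen2 lane after 8b/10b encoding
    "ethernet_1g": 1000.0 / 8,  # 1 Gb/s line rate = 125 MB/s
    "dram": 6400.0,             # assumed 64-bit interface at 800 MT/s
}

# Sustained throughput figures as reported in the abstract, in MB/s.
MEASURED_MBPS = {
    "pcie_gen2_x4": 1470.0,
    "ethernet_1g": 120.2,
    "dram": 5930.0,
}


def efficiency(measured_mbps, peak_mbps):
    """Return measured throughput as a fraction of the nominal peak."""
    if peak_mbps <= 0:
        raise ValueError("peak bandwidth must be positive")
    return measured_mbps / peak_mbps


def report():
    """Map each interface name to its fraction of the assumed nominal peak."""
    return {
        name: efficiency(MEASURED_MBPS[name], NOMINAL_PEAK_MBPS[name])
        for name in MEASURED_MBPS
    }


if __name__ == "__main__":
    for name, fraction in report().items():
        print(f"{name}: {fraction:.1%} of assumed nominal peak")
```

Under these assumed peaks, the computed fractions land within about a percentage point of the 74%, 96%, and 92.5% quoted above; the small differences suggest the reported percentages were derived from slightly different peak values or rounding.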
