Direct GPU/FPGA Communication via PCI Express

We describe a mechanism for connecting GPU and FPGA devices directly via the PCI Express bus, enabling the transfer of data between these heterogeneous computing units without the intermediate use of system memory. We evaluate the performance benefits of this approach over a range of transfer sizes, and demonstrate its utility in a computer vision application. We find that bypassing system memory yields improvements as high as 2.2× in data transfer speed, and 1.9× in application performance.
