pvFPGA: paravirtualising an FPGA-based hardware accelerator towards general purpose computing

This paper presents an improved design of pvFPGA, a novel system design solution for virtualising an FPGA-based hardware accelerator by a virtual machine monitor (VMM). The accelerator design on the FPGA can be used to accelerate various applications, regardless of their computation latencies. In the implementation, we adopt the Xen VMM to build a paravirtualised environment and use a Xilinx Virtex-6 as the FPGA accelerator. Data is transferred between the x86 server and the FPGA accelerator through direct memory access (DMA), and a streaming pipeline technique is adopted to improve the efficiency of data transfer. Several solutions to streaming pipeline hazards are discussed in this paper. In addition, we propose a technique, hyper-requesting, which achieves request-level parallelism: through DMA context switches, it enables portions of two requests destined for different accelerator applications to be processed on the FPGA accelerator simultaneously. The experimental results show that hyper-requesting reduces request turnaround time by up to 80%.
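The effect of overlapping two requests can be illustrated with a toy timing model. The sketch below is not the pvFPGA implementation: it assumes a single shared DMA engine, two requests that both arrive at time zero and target different accelerator applications, and made-up stage latencies. The names `Request`, `serial_turnaround`, and `overlapped_turnaround` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    dma_in: float    # time to move input data to the FPGA (assumed value)
    compute: float   # accelerator application latency (assumed value)
    dma_out: float   # time to move results back to the host (assumed value)


def serial_turnaround(reqs):
    """Each request runs to completion before the next one starts."""
    clock = 0.0
    finished = {}
    for r in reqs:
        clock += r.dma_in + r.compute + r.dma_out
        finished[r.name] = clock
    return finished


def overlapped_turnaround(reqs):
    """Toy model of overlap: the shared DMA engine switches to the next
    request while an earlier request is still computing. Computation on
    different accelerator applications is assumed to run in parallel."""
    dma_free = 0.0
    pending = []
    # Input transfers share the DMA engine, so they are issued back to back.
    for r in reqs:
        dma_free += r.dma_in
        pending.append((dma_free + r.compute, r))
    finished = {}
    # Output transfers are served in order of compute completion.
    for compute_done, r in sorted(pending, key=lambda p: p[0]):
        start = max(dma_free, compute_done)
        dma_free = start + r.dma_out
        finished[r.name] = dma_free
    return finished


# Illustrative latencies only: A has a long computation, B a short one.
requests = [Request("A", 1.0, 10.0, 1.0), Request("B", 1.0, 1.0, 1.0)]
serial = serial_turnaround(requests)
overlapped = overlapped_turnaround(requests)
print(serial)      # {'A': 12.0, 'B': 15.0}
print(overlapped)  # {'B': 4.0, 'A': 12.0}
```

In this model the short request B no longer waits behind the long computation of A, which is the kind of turnaround-time reduction the abstract attributes to hyper-requesting. The numbers here come from the assumed latencies and say nothing about the paper's measured results.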
