Programming framework for clusters with heterogeneous accelerators

We describe a programming framework for high-performance clusters with various hardware accelerators. The framework lets users exploit the available heterogeneous resources productively and efficiently. Distributed applications are highly modularized to support dynamic system configuration as the types and number of accelerators change. Multiple layers of communication interfaces are introduced to reduce the overhead of both control messages and data transfers. Parallelism is achieved by controlling the accelerators under various schemes through a scheduling extension. The framework has been used to develop physics simulation and financial applications. We achieve significant performance improvement on a 16-node cluster with FPGA and GPU accelerators.
