Configuration and Programming of Heterogeneous Multiprocessors on a Multi-FPGA System Using TMD-MPI

Recent research has shown that FPGAs have real potential to speed up demanding applications beyond what state-of-the-art superscalar processors can achieve. The penalty is a loss of generality in the architecture, but the reconfigurability of FPGAs allows them to be reprogrammed for other applications. An efficient programming model and a flexible design flow are therefore paramount for this technology to be more widely accepted. Furthermore, standards have historically benefited computing because they provide a common ground for research and development. A programming model for multiprocessor systems-on-FPGAs should be standard and application independent, yet optimized for a particular architecture. In this paper, we use TMD-MPI, a subset implementation of the MPI message-passing standard, and a flexible system-level design flow to implement heterogeneous multiprocessor systems-on-chip on FPGAs. Hardware engines are also supported by using a message passing engine, which encapsulates the TMD-MPI functionality in hardware, to enable communication between hardware engines and embedded processors. We test the functionality and scalability of the system by implementing a 45-processor system across five FPGAs. As a test example, we solve the heat equation using the Jacobi iteration method. Several performance metrics are measured to demonstrate the impact of different computing cores on the overall computation.
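
To make the test example concrete, the following is a minimal sketch of Jacobi iteration for the steady-state 2D heat equation, with the grid split into horizontal strips so that each strip plays the role of one processor. It is an illustration under stated assumptions, not the implementation used in the paper: it does not call TMD-MPI or any MPI library, the halo exchange between neighbouring strips is emulated with plain list copies, and all names (`make_strips`, `exchange_halos`, `jacobi_sweep`, `solve`), the grid size, and the boundary temperatures are hypothetical.

```python
# Jacobi iteration for the steady-state 2D heat equation on a grid that is
# partitioned into horizontal strips, one strip per simulated "rank".
# No message-passing library is used; copies stand in for send/receive.

def make_strips(rows, cols, ranks, top_temp):
    """Split the rows evenly; each strip gets one halo row above and below."""
    assert rows % ranks == 0, "sketch assumes an even split"
    local = rows // ranks
    strips = [[[0.0] * cols for _ in range(local + 2)] for _ in range(ranks)]
    # Fixed hot boundary along the top edge: the top halo row of rank 0.
    strips[0][0] = [top_temp] * cols
    return strips

def exchange_halos(strips):
    """Stand-in for the send/receive pairs between neighbouring ranks."""
    for r in range(len(strips) - 1):
        upper, lower = strips[r], strips[r + 1]
        lower[0] = list(upper[-2])   # last owned row of upper -> top halo of lower
        upper[-1] = list(lower[1])   # first owned row of lower -> bottom halo of upper

def jacobi_sweep(strip):
    """One Jacobi update of the owned rows; returns (new strip, max change)."""
    new = [row[:] for row in strip]
    delta = 0.0
    for i in range(1, len(strip) - 1):
        for j in range(1, len(strip[i]) - 1):  # first and last columns stay fixed
            new[i][j] = 0.25 * (strip[i - 1][j] + strip[i + 1][j]
                                + strip[i][j - 1] + strip[i][j + 1])
            delta = max(delta, abs(new[i][j] - strip[i][j]))
    return new, delta

def solve(rows=8, cols=8, ranks=4, top_temp=100.0, tol=1e-6, max_iters=10000):
    strips = make_strips(rows, cols, ranks, top_temp)
    iterations = 0
    for iterations in range(1, max_iters + 1):
        exchange_halos(strips)
        results = [jacobi_sweep(s) for s in strips]
        strips = [s for s, _ in results]
        # Stand-in for a global reduction of the per-rank residuals.
        if max(d for _, d in results) < tol:
            break
    # Gather the owned rows of every strip into one grid.
    grid = [row for s in strips for row in s[1:-1]]
    return grid, iterations
```

Because every strip reads only values from the previous iteration, including the halo rows copied in before each sweep, the partitioned run performs the same arithmetic as a single-strip run. In a real message-passing version the two copies in `exchange_halos` would become send and receive calls between neighbouring ranks, and the `max` over residuals would become a reduction.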
