Attacking the semantic gap between application programming languages and configurable hardware

It is difficult to exploit the massive, fine-grained parallelism of configurable hardware with a conventional application programming language such as C, Pascal or Java. The difficulty arises from the mismatch between the synchronous, concurrent processing capability of the hardware and the expressiveness of the language, the so-called "semantic gap." We attack this problem by using a programming model matched to the hardware's capabilities that can be implemented in any (unmodified) object-oriented language, and by building a corresponding compiler. The result is application code that can be developed, compiled, debugged and executed on a personal computer using conventional tools (such as Visual C++ or Visual Cafe), and then recompiled without modification to the configurable hardware target. A straightforward C++ implementation of the Serpent encryption algorithm compiled with our compiler onto a Virtex XCV1000 FPGA yielded an implementation that was smaller (3200 vs. 4502 CLBs) and faster (77 MHz vs. 38 MHz) than an independent VHDL implementation with the same degree of pipelining. A tuned version of the source yielded an implementation that ran at 95 MHz.
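
As a rough reading of the reported figures, the ratios below are computed only from the numbers quoted in the abstract. The labels "area ratio" and "clock ratio" are shorthand introduced here, not terms from the paper, and the values are rounded.

```latex
\begin{align*}
\text{area ratio (compiled C++ / VHDL)} &= \frac{3200\ \text{CLBs}}{4502\ \text{CLBs}} \approx 0.71
  \quad (\text{about } 29\% \text{ fewer CLBs}) \\
\text{clock ratio (compiled C++ / VHDL)} &= \frac{77\ \text{MHz}}{38\ \text{MHz}} \approx 2.03 \\
\text{clock ratio (tuned / straightforward source)} &= \frac{95\ \text{MHz}}{77\ \text{MHz}} \approx 1.23
\end{align*}
```

The abstract gives no CLB count or pipelining depth for the tuned version, so no area figure is derived for it and it is compared here only against the straightforward source.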
