Compilation Increasing the Scheduling Scope for Multi-memory-FPGA-Based Custom Computing Machines

This paper presents new results on the automatic mapping of abstract algorithms, written in imperative software programming languages, to custom computing machines. The reconfigurable hardware of the target architecture consists of one field-programmable gate array (FPGA) coupled with one or more memories. The compilation flow exposes operation-level and functional-level parallelism, as well as opportunities for speculative execution, and represents them efficiently in a hierarchical model. To take full advantage of this representation, the scheduling scope is enlarged by merging basic blocks at loop boundaries and by considering the parallel execution of exposed concurrent loops. The paper describes the scheduling technique, studies the impact of the merge operation, and reports the improvements achieved when the exposed parallelism is fully exploited.
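To make the idea of a larger scheduling scope concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It assumes unit-latency operations, unlimited hardware resources, and a plain as-soon-as-possible (ASAP) schedule; the operation names (`a1`, `b1`, ...), the helper functions, and the dependence graph are all hypothetical. It compares scheduling two basic blocks one after the other with scheduling them as a single merged block.

```python
from typing import Dict, Set

# Hypothetical representation: operation -> operations it depends on.
Deps = Dict[str, Set[str]]


def asap_length(deps: Deps) -> int:
    """Control steps needed by an ASAP schedule (unit latency, no resource limits)."""
    step: Dict[str, int] = {}

    def visit(op: str) -> int:
        # An operation runs one step after its latest predecessor.
        if op not in step:
            step[op] = 1 + max((visit(p) for p in deps[op]), default=0)
        return step[op]

    return max((visit(op) for op in deps), default=0)


def restrict(deps: Deps, block: Set[str]) -> Deps:
    """Keep only the dependences internal to one basic block.

    Dependences that cross the block boundary are already satisfied
    when the blocks execute strictly one after the other.
    """
    return {op: deps[op] & block for op in block}


# Toy example: block A precedes block B; b1 needs only a1, not a2.
DEPS: Deps = {"a1": set(), "a2": {"a1"}, "b1": {"a1"}, "b2": {"b1"}}
BLOCK_A = {"a1", "a2"}
BLOCK_B = {"b1", "b2"}

# Block-by-block scheduling: B cannot start before all of A has finished.
separate = asap_length(restrict(DEPS, BLOCK_A)) + asap_length(restrict(DEPS, BLOCK_B))

# Merged scope: b1 may overlap with a2 because it depends on a1 only.
merged = asap_length(DEPS)

print(separate, merged)
```

In this toy graph the merged scope lets `b1` run in the same control step as `a2`, so the merged schedule is one step shorter than the block-by-block one. A real compiler would also have to account for resource constraints, memory ports, and control dependences, which this sketch ignores.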
