Temporal Partitioning and Scheduling Data Flow Graphs for Reconfigurable Computers

FPGA-based configurable computing machines are evolving rapidly. They can deliver very high performance at a fraction of the cost of supercomputers. The first generation of configurable computers (those with multiple FPGAs connected using a specific interconnect) used statically reconfigurable FPGAs. On these configurable computers, computations are performed by partitioning an entire task into spatially interconnected subtasks. Such configurable computers are used in logic emulation systems and for functional verification of hardware. In general, configurable computers can be reconfigured rapidly to any desired custom form. Hence, the available resources can be reused effectively to cut hardware costs and improve performance. In this paper, we introduce the concept of temporal partitioning, which partitions a task into temporally interconnected subtasks. Specifically, we present algorithms for temporal partitioning and scheduling of data flow graphs for configurable computers. We are given a configurable computing unit (RPU) with a logic capacity of $S_{RPU}$ and a computational task represented by an acyclic data flow graph $G = (V, E)$. Computations whose logic area requirements exceed $S_{RPU}$ cannot be completely mapped onto a configurable computer using traditional spatial mapping techniques. However, temporal partitioning of the data flow graph followed by proper scheduling makes execution on the configurable computer possible. Temporal partitioning of the data flow graph is a k-way partitioning of $G = (V, E)$ such that the logic requirement of each partitioned segment does not exceed $S_{RPU}$. Scheduling assigns an execution order to the partitioned segments so as to ensure proper execution.
Thus, for each segment in $\{s_1, s_2, \ldots, s_k\}$, scheduling assigns a unique ordering $s_i \to j$, $1 \le i \le k$, $1 \le j \le k$, such that the computation executes in the proper sequential order defined by the flow graph $G = (V, E)$.
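
The problem statement above can be illustrated with a minimal sketch. This is not the algorithm presented in the paper; it is a simple greedy packing, shown only to make the capacity constraint and the ordering requirement concrete. The function name `temporal_partition`, the `area` and `edges` inputs, and the area units are all assumptions introduced for illustration.

```python
from collections import deque


def temporal_partition(area, edges, s_rpu):
    """Greedy temporal partitioning of an acyclic data flow graph.

    area:  dict mapping each node to its logic area (hypothetical units)
    edges: iterable of (u, v) pairs, meaning v consumes the output of u
    s_rpu: logic capacity of the RPU
    Returns a list of segments; the list index is the execution order.
    """
    succs = {v: [] for v in area}
    indeg = {v: 0 for v in area}
    for u, v in edges:
        succs[u].append(v)
        indeg[v] += 1

    # Kahn's algorithm: produce a topological order of the nodes.
    ready = deque(sorted(v for v in area if indeg[v] == 0))
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in succs[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if len(order) != len(area):
        raise ValueError("graph has a cycle")

    # Pack nodes into consecutive segments in topological order, opening a
    # new segment whenever the next node would exceed the RPU capacity.
    segments, current, used = [], [], 0
    for v in order:
        if area[v] > s_rpu:
            raise ValueError(f"node {v!r} alone exceeds the RPU capacity")
        if used + area[v] > s_rpu:
            segments.append(current)
            current, used = [], 0
        current.append(v)
        used += area[v]
    if current:
        segments.append(current)
    return segments


# Hypothetical four-node graph: a and b feed c, and c feeds d.
example_area = {"a": 3, "b": 2, "c": 4, "d": 1}
example_edges = [("a", "c"), ("b", "c"), ("c", "d")]
print(temporal_partition(example_area, example_edges, s_rpu=5))
# [['a', 'b'], ['c', 'd']]
```

Because nodes are packed in topological order, every edge runs from a segment to the same or a later segment, so executing the segments in list order respects the dependencies of the graph. The sketch makes no attempt to minimize the number of segments or the data transferred between them.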
