Compiling Techniques for Coarse Grained Runtime Reconfigurable Architectures

In this paper, we develop compilation techniques for realizing applications described in a High Level Language (HLL) on a Runtime Reconfigurable Architecture. The compiler determines Hyper Operations (HyperOps), which are subgraphs of an application's data flow graph and comprise elementary operations that have a strong producer-consumer relationship. These HyperOps are hosted on computation structures that are provisioned on demand at runtime. We also report compiler optimizations that collectively reduce the overheads of data-driven computations in runtime reconfigurable architectures. On average, HyperOps offer a 44% reduction in total execution time and an 18% reduction in management overheads compared to using basic blocks as coarse grained operations. We show that HyperOps formed using our compiler are suitable for supporting data flow software pipelining.
