Mapping Multi-Domain Applications Onto Coarse-Grained Reconfigurable Architectures

Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their performance and flexibility. However, their use has been restricted to integer-based application domains, since typical CGRAs support only integer arithmetic and logical operations. This paper introduces approaches to mapping applications onto CGRAs that support both integer and floating-point arithmetic. After presenting an optimal formulation using integer linear programming, we present a fast heuristic mapping algorithm. In experiments on randomly generated examples, the heuristic algorithm finds optimal mappings for 97% of the examples within a few seconds. We observe similar results for practical examples from multimedia and 3-D graphics benchmarks. The applications mapped onto a CGRA show up to 120 times performance improvement over software implementations, demonstrating the potential for application acceleration on CGRAs supporting floating-point operations.
