Virtualization on the Tartan Reconfigurable Architecture

Spatial computing (SC) offers the potential for large improvements in performance and energy efficiency. Many proposed architectures have harnessed these benefits for small kernels; the Tartan architecture attempts to extend them to entire general-purpose applications executing spatially. Previous work on Tartan assumed a configure-once model of execution, which requires prohibitively large amounts of hardware to execute most programs. In this paper, we explore a virtualization model for Tartan. With virtualization, Tartan can execute large programs with a realistic amount of hardware and with performance comparable to the configure-once model. We focus on three aspects of virtualization: runtime placement of code blocks, location resolution methods for inter-block communication, and the impact of prefetching on configuration delays. Our results show that the Tartan fabric can be virtualized with no loss in performance compared to a configure-once fabric of unlimited size.
