论文信息 - Plasticine: A Reconfigurable Accelerator for Parallel Patterns

Plasticine: A Reconfigurable Accelerator for Parallel Patterns

Plasticine is a new spatially reconfigurable architecture designed to efficiently execute applications composed of high-level parallel patterns. With an area footprint of 113 mm2 in a 28-nm process and a 1-GHz clock, Plasticine has a peak floating-point performance of 12.3 single-precision Tflops and a total on-chip memory capacity of 16 MB, consuming a maximum power of 49 W. Plasticine provides an improvement of up to 76.9X in performance-per-watt over a conventional FPGA over a wide range of dense and sparse applications.

[1] Kunle Olukotun,et al. Locality-Aware Mapping of Nested Parallel Patterns on GPUs , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.

[2] Kunle Olukotun,et al. Generating Configurable Hardware from Parallel Patterns , 2015, International Conference on Architectural Support for Programming Languages and Operating Systems.

[3] Christoforos E. Kozyrakis,et al. Understanding sources of inefficiency in general-purpose chips , 2010, ISCA.

[4] Nam Sung Kim,et al. GPUWattch: enabling energy optimizations in GPGPUs , 2013, ISCA.

[5] Kunle Olukotun,et al. Hardware system synthesis from Domain-Specific Languages , 2014, 2014 24th International Conference on Field Programmable Logic and Applications (FPL).

[6] Jia Wang,et al. DaDianNao: A Machine-Learning Supercomputer , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.

[7] Carl Ebeling,et al. Architecture design of reconfigurable pipelined datapaths , 1999, Proceedings 20th Anniversary Conference on Advanced Research in VLSI.

[8] Seth Copen Goldstein,et al. PipeRench: a co/processor for streaming multimedia acceleration , 1999, ISCA.

[9] Seth Copen Goldstein,et al. Tartan: evaluating spatial computation for whole program execution , 2006, ASPLOS XII.

[10] Kunle Olukotun,et al. Plasticine: A reconfigurable architecture for parallel patterns , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).

[11] Kunle Olukotun,et al. Spatial: a language and compiler for application accelerators , 2018, PLDI.

[12] Rudy Lauwereins,et al. ADRES: An Architecture with Tightly Coupled VLIW Processor and Coarse-Grained Reconfigurable Matrix , 2003, FPL.

[13] W. Marsden. I and J , 2012 .

[14] Karthikeyan Sankaralingam,et al. DySER: Unifying Functionality and Parallelism Specialization for Energy-Efficient Computing , 2012, IEEE Micro.

[15] Joel Emer,et al. Eyeriss: an Energy-efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks Accessed Terms of Use , 2022 .

[16] Kunle Olukotun,et al. Delite , 2014, ACM Trans. Embed. Comput. Syst..

[17] Henry Hoffmann,et al. The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs , 2002, IEEE Micro.