Automated Design Space Exploration of CGRA Processing Element Architectures using Frequent Subgraph Analysis

The architecture of a coarse-grained reconfigurable array (CGRA) processing element (PE) has a significant effect on the performance and energy efficiency of an application running on the CGRA. This paper presents an automated approach for generating specialized PE architectures for an application or an application domain. Frequent subgraphs mined from a set of applications are merged to form a PE architecture specialized to that application domain. For the image processing and machine learning domains, we generate specialized PEs that are up to 10.5× more energy efficient and consume 9.1× less area than a baseline PE.
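The mine-then-merge idea can be illustrated with a deliberately small sketch. It is not the paper's algorithm: it assumes a toy dataflow graph, restricts "subgraphs" to single producer-to-consumer edges, counts raw occurrences instead of using a proper subgraph-support measure, and merges by simply taking the union of operations. The names `frequent_edge_patterns`, `merge_patterns`, and `min_support` are hypothetical.

```python
from collections import Counter


def frequent_edge_patterns(nodes, edges, min_support):
    """Count single-edge patterns (producer op -> consumer op) and keep
    those that occur at least min_support times.

    Assumption: real frequent subgraph miners handle larger patterns and
    use more careful support definitions; this only counts edge labels.
    """
    counts = Counter((nodes[src], nodes[dst]) for src, dst in edges)
    return {pattern: n for pattern, n in counts.items() if n >= min_support}


def merge_patterns(patterns):
    """Naive merge: one functional unit per distinct operation and one
    internal connection per distinct frequent pattern (hypothetical PE model).
    """
    units = sorted({op for pattern in patterns for op in pattern})
    links = sorted(patterns)
    return units, links


# Toy application graph: node id -> operation, plus directed data edges.
nodes = {0: "mul", 1: "add", 2: "mul", 3: "add", 4: "shift", 5: "mul", 6: "add"}
edges = [(0, 1), (2, 3), (3, 4), (5, 6), (1, 3)]

frequent = frequent_edge_patterns(nodes, edges, min_support=2)
units, links = merge_patterns(frequent)

print(frequent)  # only the mul -> add pattern meets the threshold
print(units, links)
```

In this toy graph only the multiply-then-add pattern is frequent, so the sketched PE would contain a multiplier feeding an adder. A real flow would also need to weigh area and energy when deciding which patterns to merge.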
