Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators

This paper presents Sparseloop, the first infrastructure to implement an analytical design space exploration methodology for sparse tensor accelerators. Sparseloop comprehends a wide set of architecture specifications, including various sparse optimization features such as compressed tensor storage. Using these specifications together with stochastic tensor density models, Sparseloop calculates a design's energy efficiency, accounting for both the savings and the metadata overhead that the optimizations introduce at each storage and compute level of the architecture. We validate Sparseloop against a well-known accelerator design and achieve ∼99% accuracy on runtime activities (e.g., compressed memory accesses). We also present a case study highlighting the key factors (e.g., uncompressed traffic, data density) that determine how much sparse optimization features improve energy efficiency. The tool is available at https://github.com/NVlabs/timeloop.
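
To make the estimation model concrete, the sketch below illustrates the kind of first-order, per-level accounting the abstract describes: under a simple stochastic (Bernoulli) density model, compressed storage scales data accesses by the tensor density while adding format-dependent metadata accesses. All function names, per-access energies, and access counts here are hypothetical placeholders for illustration, not values or APIs from Sparseloop itself.

```python
# A minimal, hypothetical sketch of analytical per-level energy estimation
# for a compressed tensor, assuming a uniform (Bernoulli) density model.
# None of these numbers or names come from Sparseloop; they are placeholders.

def level_energy(dense_accesses, density, e_data, e_meta, meta_per_access):
    """Expected energy (pJ) at one storage level.

    dense_accesses  -- accesses an uncompressed mapping would issue
    density         -- expected fraction of nonzeros in the tensor
    e_data, e_meta  -- energy per data / metadata access at this level (pJ)
    meta_per_access -- metadata accesses needed per compressed data access
                       (depends on the compressed format, e.g., bitmask vs. CSR)
    """
    data_accesses = dense_accesses * density          # optimization savings
    meta_accesses = data_accesses * meta_per_access   # metadata overhead
    return data_accesses * e_data + meta_accesses * e_meta

# Illustrative two-level hierarchy: cheap local buffer, expensive DRAM.
buffer_energy = level_energy(4_000_000, density=0.3,
                             e_data=2.0, e_meta=1.0, meta_per_access=1.0)
dram_energy = level_energy(1_000_000, density=0.3,
                           e_data=200.0, e_meta=200.0, meta_per_access=1.0)
print(f"estimated total energy: {(buffer_energy + dram_energy) / 1e6:.2f} uJ")
```

Varying the density and the per-access metadata cost in such a model exposes the trade-off the case study examines: at high density, metadata overhead can outweigh the savings from skipping zeros.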
