Temperature-Aware Optimization of Monolithic 3D Deep Neural Network Accelerators

We propose an automated method to facilitate the design of energy-efficient Mono3D DNN accelerators with safe on-chip temperatures for mobile systems. We introduce an optimizer that explores different chip aspect ratios and footprint specifications and selects energy-efficient accelerators under user-specified thermal and performance constraints. Compared to a Mono3D DNN accelerator optimized only for performance, our optimizer reduces energy consumption by 1.6× and area by 2×, with at most a 9.5% increase in latency.
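
The selection step described above can be illustrated with a minimal sketch. It is not the paper's optimizer: the `Candidate` fields, the `select_design` function, and all numeric values are hypothetical and chosen only for illustration. The sketch assumes that each candidate (one aspect ratio and footprint) has already been evaluated for energy, latency, and peak temperature, and it simply picks the lowest-energy candidate that meets the user-specified thermal and latency limits.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One evaluated accelerator configuration (all fields are hypothetical)."""
    aspect_ratio: float    # chip width / height
    footprint_mm2: float   # chip footprint
    energy_mj: float       # energy per inference
    latency_ms: float      # inference latency
    peak_temp_c: float     # peak on-chip temperature


def select_design(
    candidates: Iterable[Candidate],
    max_temp_c: float,
    max_latency_ms: float,
) -> Optional[Candidate]:
    """Return the lowest-energy candidate that satisfies both constraints.

    Returns None when no candidate is feasible.
    """
    feasible = [
        c for c in candidates
        if c.peak_temp_c <= max_temp_c and c.latency_ms <= max_latency_ms
    ]
    return min(feasible, key=lambda c: c.energy_mj, default=None)


# Illustrative values only; they are not results from the paper.
candidates = [
    Candidate(1.0, 4.0, 10.0, 5.0, 95.0),  # fastest, but violates the thermal limit
    Candidate(2.0, 2.0, 6.0, 5.4, 78.0),   # feasible
    Candidate(4.0, 2.0, 5.5, 7.0, 70.0),   # lowest energy, but violates the latency limit
]

best = select_design(candidates, max_temp_c=80.0, max_latency_ms=6.0)
print(best)
```

With these made-up numbers, the second candidate is selected: it gives up a little latency relative to the fastest design in exchange for lower energy, a smaller footprint, and a safe peak temperature. A real flow would obtain the per-candidate metrics from performance, power, and thermal simulation rather than from a hand-written list.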
