Enabling hard constraints in differentiable neural network and accelerator co-exploration

Co-exploration of an optimal neural architecture and its hardware accelerator is an approach of rising interest that addresses the computational cost problem, especially in low-profile systems. The large co-exploration space is often handled by adopting the paradigm of differentiable neural architecture search. However, despite its superior search efficiency, differentiable co-exploration faces a critical challenge: it cannot systematically satisfy hard constraints such as frame rate. To handle this hard constraint problem, we propose HDX, which searches for hard-constrained solutions without compromising the global design objectives. By manipulating the gradients in the interest of the given hard constraint, HDX obtains high-quality solutions that satisfy the constraint.
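To make the gradient-manipulation idea concrete, below is a minimal, hypothetical sketch of one common way to steer a differentiable search toward a feasible region: when the hard constraint is violated, the component of the objective gradient that conflicts with reducing the violation is projected out, and a descent direction on the violation itself is added. All names (`constrained_step`, `task_loss`, `constraint_violation`) are illustrative assumptions, and the projection rule shown is a generic heuristic, not necessarily the exact update used by HDX.

```python
import torch

def constrained_step(params, task_loss, constraint_violation, optimizer):
    """One update biasing the search toward satisfying a hard constraint.

    params: differentiable search parameters (architecture + accelerator).
    task_loss: scalar tensor for the global design objective (e.g., accuracy/energy).
    constraint_violation: scalar tensor, > 0 when the hard constraint
        (e.g., a frame-rate target) is violated, 0 otherwise.
    """
    params = list(params)
    # Gradients of the objective and of the constraint violation.
    g_task = torch.autograd.grad(task_loss, params, retain_graph=True, allow_unused=True)
    g_con = torch.autograd.grad(constraint_violation, params, allow_unused=True)

    optimizer.zero_grad()
    for p, gt, gc in zip(params, g_task, g_con):
        gt = torch.zeros_like(p) if gt is None else gt
        gc = torch.zeros_like(p) if gc is None else gc
        if constraint_violation.item() > 0 and gc.norm() > 0:
            dot = torch.dot(gt.flatten(), gc.flatten())
            if dot < 0:
                # Remove the component of the task gradient that would push
                # the solution further into the infeasible region.
                gt = gt - (dot / (gc.norm() ** 2 + 1e-12)) * gc
            # Also descend on the violation itself (optimizer steps along -grad).
            p.grad = gt + gc
        else:
            p.grad = gt
    optimizer.step()
```

In this sketch the constraint only influences the update while it is violated; once the solution is feasible, the search reverts to optimizing the global objective alone, which matches the stated goal of satisfying the hard constraint without compromising the overall design objectives.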
