You Only Search Once: A Fast Automation Framework for Single-Stage DNN/Accelerator Co-design

DNN/accelerator co-design has shown great potential for improving quality of results (QoR) and performance. Typical approaches separate the design flow into two stages: (1) designing an application-specific DNN model with high accuracy, and (2) building an accelerator tailored to that DNN's characteristics. However, this two-stage flow may fail to deliver the highest composite score, which combines accuracy with hardware-related constraints (e.g., latency and energy efficiency), when building a specific neural-network-based system. In this work, we present YOSO, a single-stage automated framework that aims to generate an optimal software-and-hardware solution, flexibly balancing the goals of accuracy, power, and QoS. Compared with the two-stage method on a baseline systolic-array accelerator and the CIFAR-10 dataset, YOSO achieves 1.42x-2.29x energy reduction or 1.79x-3.07x latency reduction at the same accuracy level, under different user-specified energy and latency optimization constraints, respectively.
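To make the single-stage idea concrete, below is a minimal Python sketch of how a joint search might score each candidate (DNN configuration, accelerator configuration) pair with one composite objective, rather than fixing the DNN first and the hardware second. The function names, penalty form, weights, and budget parameters are illustrative assumptions for exposition only; they are not YOSO's actual formulation.

```python
def composite_score(accuracy, latency_ms, energy_mj,
                    latency_budget_ms, energy_budget_mj,
                    w_lat=0.5, w_energy=0.5):
    """Fold accuracy and hardware costs into one scalar (assumed form).

    Candidates that exceed a user-specified latency or energy budget are
    penalized proportionally, so the search trades off accuracy against
    hardware constraints instead of optimizing accuracy alone.
    """
    score = accuracy
    if latency_ms > latency_budget_ms:
        score -= w_lat * (latency_ms / latency_budget_ms - 1.0)
    if energy_mj > energy_budget_mj:
        score -= w_energy * (energy_mj / energy_budget_mj - 1.0)
    return score


def single_stage_search(candidates, evaluate,
                        latency_budget_ms, energy_budget_mj):
    """Single-stage co-design loop: each candidate pairs a DNN config with
    accelerator parameters, so both are evaluated and ranked together.

    `evaluate` is an assumed user-supplied callback returning
    (accuracy, latency_ms, energy_mj) for one software/hardware pair.
    """
    best_pair, best_score = None, float("-inf")
    for dnn_cfg, hw_cfg in candidates:
        acc, lat, energy = evaluate(dnn_cfg, hw_cfg)
        s = composite_score(acc, lat, energy,
                            latency_budget_ms, energy_budget_mj)
        if s > best_score:
            best_pair, best_score = (dnn_cfg, hw_cfg), s
    return best_pair, best_score
```

The key design choice this sketch illustrates is that a single scalar objective lets one search loop explore the joint software/hardware space, which is what distinguishes a single-stage framework from the two-stage flow described above.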
