An Energy-Efficient Sparse Deep-Neural-Network Learning Accelerator With Fine-Grained Mixed Precision of FP8–FP16
Hoi-Jun Yoo | Jinmook Lee | Jinsu Lee | Donghyeon Han | Juhyoung Lee | Gwangtae Park
[1] Hoi-Jun Yoo, et al. UNPU: A 50.6TOPS/W unified deep neural network accelerator with 1b-to-16b fully-variable weight bit-precision, 2018, 2018 IEEE International Solid-State Circuits Conference (ISSCC).
[2] Meng-Fan Chang, et al. Sticker: A 0.41-62.1 TOPS/W 8Bit Neural Network Processor with Multi-Sparsity Compatible Convolution Arrays and Online Tuning Acceleration for Fully Connected Layers, 2018, 2018 IEEE Symposium on VLSI Circuits.
[3] Vikas Chandra, et al. Deep Convolutional Neural Network Inference with Floating-point Weights and Fixed-point Activations, 2017, arXiv.
[4] Natalie D. Enright Jerger, et al. Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing, 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[5] Hoi-Jun Yoo, et al. 7.7 LNPU: A 25.3TFLOPS/W Sparse Deep-Neural-Network Learning Processor with Fine-Grained Mixed Precision of FP8-FP16, 2019, 2019 IEEE International Solid-State Circuits Conference (ISSCC).
[6] Swagath Venkataramani, et al. A Scalable Multi-TeraOPS Core for AI Training and Inference, 2018, IEEE Solid-State Circuits Letters.
[7] Pritish Narayanan, et al. Deep Learning with Limited Numerical Precision, 2015, ICML.
[8] Tadahiro Kuroda, et al. QUEST: A 7.49TOPS multi-purpose log-quantized DNN inference engine stacked on 96MB 3D SRAM using inductive-coupling technology in 40nm CMOS, 2018, 2018 IEEE International Solid-State Circuits Conference (ISSCC).
[9] Eunhyeok Park, et al. Energy-Efficient Neural Network Accelerator Based on Outlier-Aware Low-Precision Computation, 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).