Laius: An 8-Bit Fixed-Point CNN Hardware Inference Engine
Wenyuan Lu | Lei Wang | Haifang Zhou | Yu Deng | Qiang Dou | Zhisheng Li | Shasha Guo
[1] Yuxing Tang, et al. FixCaffe: Training CNN with Low Precision Arithmetic Operations by Fixed Point Caffe, 2017, APPT.
[2] Wenguang Chen, et al. NEUTRAMS: Neural network transformation and co-design under neuromorphic hardware constraints, 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[3] Song Han, et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network, 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[4] Ninghui Sun, et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning, 2014, ASPLOS.
[5] Tianshi Chen, et al. ShiDianNao: Shifting vision processing closer to the sensor, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[6] Yann LeCun, et al. CNP: An FPGA-based processor for Convolutional Networks, 2009, 2009 International Conference on Field Programmable Logic and Applications.
[7] Michael Ferdman, et al. Overcoming resource underutilization in spatial CNN accelerators, 2016, 2016 26th International Conference on Field Programmable Logic and Applications (FPL).
[8] R. Sindhu Reddy, et al. DLAU: A Scalable Deep Learning Accelerator Unit on FPGA, 2018.
[9] Hao Yu, et al. A GPU-Outperforming FPGA Accelerator Architecture for Binary Convolutional Neural Networks, 2017, ACM J. Emerg. Technol. Comput. Syst.
[10] Michael Ferdman, et al. Maximizing CNN accelerator efficiency through resource partitioning, 2016, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[11] Manoj Alwani, et al. Fused-layer CNN accelerators, 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[12] Gu-Yeon Wei, et al. Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators, 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[13] Pritish Narayanan, et al. Deep Learning with Limited Numerical Precision, 2015, ICML.
[14] Jason Cong, et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks, 2015, FPGA.
[15] Yu Cao, et al. Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks, 2016, FPGA.
[16] Dong Wang, et al. PipeCNN: An OpenCL-Based FPGA Accelerator for Large-Scale Convolution Neuron Networks, 2016, ArXiv.
[17] Luca Benini, et al. Curbing the roofline: a scalable and flexible architecture for CNNs on FPGA, 2016, Conf. Computing Frontiers.
[18] Jason Cong, et al. Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster, 2016, ISLPED.
[19] Soheil Ghiasi, et al. Design space exploration of FPGA-based Deep Convolutional Neural Networks, 2016, 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC).