Squeezing the Last MHz for CNN Acceleration on FPGAs
暂无分享,去创建一个
Ying Wang | Li Li | Dawen Xu | Huawei Li | Xiaowei Li | Cheng Liu | Kouzi Xing
[1] K. Saban. Xilinx Stacked Silicon Interconnect Technology Delivers Breakthrough FPGA Capacity , Bandwidth , and Power Efficiency , 2009 .
[2] Dong Wang,et al. PipeCNN: An OpenCL-based open-source FPGA accelerator for convolution neural networks , 2017, 2017 International Conference on Field Programmable Technology (ICFPT).
[3] George A. Constantinides,et al. Accuracy-Performance Tradeoffs on an FPGA through Overclocking , 2013, 2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines.
[4] Kyuyeon Hwang,et al. Fixed-point feedforward deep neural network design using weights +1, 0, and −1 , 2014, 2014 IEEE Workshop on Signal Processing Systems (SiPS).
[5] Trevor Mudge,et al. Razor: a low-power pipeline based on circuit-level timing speculation , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..
[6] Jason Cong,et al. Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks , 2019, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[7] Josep Torrellas,et al. Paceline: Improving Single-Thread Performance in Nanoscale CMPs through Core Overclocking , 2007, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).
[8] Jason Cong,et al. Caffeine: Towards uniformed representation and acceleration for deep convolutional neural networks , 2016, 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[9] Jie Xu,et al. DeepBurning: Automatic generation of FPGA-based learning accelerators for the Neural Network family , 2016, 2016 53nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[10] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[11] Augustus K. Uht. Going beyond worst-case specs with TEAtime , 2004, Computer.
[12] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[13] Sherief Reda,et al. DRUM: A Dynamic Range Unbiased Multiplier for approximate applications , 2015, 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).