Embracing Diversity: Enhanced DSP Blocks for Low-Precision Deep Learning on FPGAs
[1] Andrew C. Ling,et al. An OpenCL™ Deep Learning Accelerator on Arria 10, 2017, FPGA.
[2] Paolo Ienne,et al. Highly Versatile DSP Blocks for Improved FPGA Arithmetic Performance , 2010, 2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines.
[3] Wei Zhang,et al. Fracturable DSP Block for Multi-context Reconfigurable Architectures , 2017, Circuits Syst. Signal Process.
[4] Vaughn Betz,et al. Quantifying the Gap Between FPGA and Custom CMOS to Aid Microarchitectural Design , 2014, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[5] Kyuyeon Hwang,et al. Fixed-point feedforward deep neural network design using weights +1, 0, and −1 , 2014, 2014 IEEE Workshop on Signal Processing Systems (SiPS).
[6] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[7] Abhisek Kundu,et al. Mixed Low-precision Deep Learning Inference using Dynamic Fixed Point , 2017, ArXiv.
[8] Bruce A. Wooley,et al. A Two's Complement Parallel Array Multiplication Algorithm , 1973, IEEE Transactions on Computers.
[9] Yu Cao,et al. Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks , 2017, FPGA.
[10] Lin Xu,et al. Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights , 2017, ICLR.
[11] Eriko Nurvitadhi,et al. WRPN: Wide Reduced-Precision Networks , 2017, ICLR.
[12] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[13] Soheil Ghiasi,et al. Hardware-oriented Approximation of Convolutional Neural Networks , 2016, ArXiv.
[14] Martin Langhammer,et al. Floating-Point DSP Block Architecture for FPGAs , 2015, FPGA.
[15] Martin Margala,et al. Exploration of Low Numeric Precision Deep Learning Inference Using Intel® FPGAs , 2018, FCCM.
[16] J. L. Holt,et al. Back propagation simulations using limited precision calculations , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[17] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[18] Pritish Narayanan,et al. Deep Learning with Limited Numerical Precision , 2015, ICML.
[19] Vaughn Betz,et al. Automatic circuit design and modelling for heterogeneous FPGAs , 2017, 2017 International Conference on Field Programmable Technology (ICFPT).
[20] Yoshua Bengio,et al. Training deep neural networks with low precision multiplications , 2014.
[21] Vaughn Betz,et al. Comparing performance, productivity and scalability of the TILT overlay processor to OpenCL HLS , 2014, 2014 International Conference on Field-Programmable Technology (FPT).