A fast and scalable architecture to run convolutional neural networks in low density FPGAs
暂无分享,去创建一个
[1] I. Guyon,et al. Handwritten digit recognition: applications of neural network chips and automatic learning , 1989, IEEE Communications Magazine.
[2] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[3] Chao Wang,et al. MALOC: A Fully Pipelined FPGA Accelerator for Convolutional Neural Networks With All Layers Mapped on Chip , 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[4] Jing Li,et al. Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network , 2017, FPGA.
[5] Junzhong Shen,et al. FPGA‐accelerated deep convolutional neural networks for high throughput and energy efficiency , 2017, Concurr. Comput. Pract. Exp..
[6] Natalie D. Enright Jerger,et al. Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[7] Michael Ferdman,et al. Maximizing CNN accelerator efficiency through resource partitioning , 2016, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[8] Philip Heng Wai Leong,et al. FINN: A Framework for Fast, Scalable Binarized Neural Network Inference , 2016, FPGA.
[9] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Yu Wang,et al. Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA , 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[11] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[12] Wayne Luk,et al. FP-BNN: Binarized neural network on FPGA , 2018, Neurocomputing.
[13] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[14] Shijie Li,et al. Throughput-Optimized FPGA Accelerator for Deep Convolutional Neural Networks , 2017, ACM Trans. Reconfigurable Technol. Syst..
[15] E. George Walters. Array Multipliers for High Throughput in Xilinx FPGAs with 6-Input LUTs , 2016, Comput..
[16] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[17] Scott A. Mahlke,et al. Scalpel: Customizing DNN pruning to the underlying hardware parallelism , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[18] Jason Cong,et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.
[19] Soheil Ghiasi,et al. Design space exploration of FPGA-based Deep Convolutional Neural Networks , 2016, 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC).
[20] Manoj Alwani,et al. Fused-layer CNN accelerators , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[21] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Michael Ferdman,et al. Escher: A CNN Accelerator with Flexible Buffering to Minimize Off-Chip Transfer , 2017, 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[23] Srihari Cadambi,et al. A dynamically configurable coprocessor for convolutional neural networks , 2010, ISCA.
[24] Yu Wang,et al. Going Deeper with Embedded FPGA Platform for Convolutional Neural Network , 2016, FPGA.
[25] Christos-Savvas Bouganis,et al. fpgaConvNet: Mapping Regular and Irregular Convolutional Neural Networks on FPGAs , 2019, IEEE Transactions on Neural Networks and Learning Systems.
[26] Jia Wang,et al. DaDianNao: A Machine-Learning Supercomputer , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[27] Soheil Ghiasi,et al. Hardware-oriented Approximation of Convolutional Neural Networks , 2016, ArXiv.
[28] Qiuwen Lou,et al. Design Flow of Accelerating Hybrid Extremely Low Bit-Width Neural Network in Embedded FPGA , 2018, 2018 28th International Conference on Field Programmable Logic and Applications (FPL).
[29] Mário P. Véstias,et al. Parallel dot-products for deep learning on FPGA , 2017, 2017 27th International Conference on Field Programmable Logic and Applications (FPL).
[30] Yu Cao,et al. Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks , 2016, FPGA.