A Parametrizable High-Level Synthesis Library for Accelerating Neural Networks on FPGAs
[1] Lei Feng, et al. An FPGA-Based CNN Accelerator Integrating Depthwise Separable Convolution, 2019, Electronics.
[2] Alexander V. Veidenbaum, et al. AFFIX: Automatic Acceleration Framework for FPGA Implementation of OpenVX Vision Algorithms, 2019, FPGA.
[3] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[4] Phillip H. Jones, et al. Comparing Energy Efficiency of CPU, GPU and FPGA Implementations for Vision Kernels, 2019, 2019 IEEE International Conference on Embedded Software and Systems (ICESS).
[5] Jason Cong, et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks, 2015, FPGA.
[6] Jason Cong, et al. Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks, 2019, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[7] Yu Cao, et al. Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks, 2016, FPGA.
[8] Kari Pulli, et al. OpenVX: a framework for accelerating computer vision, 2016, SIGGRAPH ASIA Courses.
[9] Xi Chen, et al. FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates, 2017, 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[10] Ming Yang, et al. 3D Convolutional Neural Networks for Human Action Recognition, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[11] Christos-Savvas Bouganis, et al. Latency-driven design for FPGA-based convolutional neural networks, 2017, 2017 27th International Conference on Field Programmable Logic and Applications (FPL).
[12] Lester Kalms, et al. Accelerated High-Level Synthesis Feature Detection for FPGAs Using HiFlipVX, 2021.
[13] Hassan Mostafa, et al. Implementation of deep neural networks on FPGA-CPU platform using Xilinx SDSOC, 2020, Analog Integrated Circuits and Signal Processing.
[14] Mei-Ling Shyu, et al. A Survey on Deep Learning, 2018, ACM Comput. Surv.
[15] Guy G.F. Lemieux, et al. JANUS: A Compilation System for Balancing Parallelism and Performance in OpenVX, 2018.
[16] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.
[17] Patrick Judd, et al. Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation, 2020, ArXiv.
[18] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[19] Diana Göhringer, et al. HiFlipVX: An Open Source High-Level Synthesis FPGA Library for Image Processing, 2019, ARC.
[20] Diana Göhringer, et al. Resource Efficient Dynamic Voltage and Frequency Scaling on Xilinx FPGAs, 2020, ARC.
[21] Diana Göhringer, et al. Exploration of OpenCL for FPGAs using SDAccel and comparison to GPUs and multicore CPUs, 2017, 2017 27th International Conference on Field Programmable Logic and Applications (FPL).
[22] Antonio Rios-Navarro, et al. Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs, 2016, ArXiv.
[23] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[24] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[25] Ricardo Tapiador-Morales, et al. Comprehensive Evaluation of OpenCL-Based CNN Implementations for FPGAs, 2017, IWANN.
[26] Xiaowei Li, et al. C-Brain: A deep learning accelerator that tames the diversity of CNNs through adaptive data-level parallelization, 2016, 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC).
[27] George A. Constantinides, et al. High-level synthesis of dynamic data structures: A case study using Vivado HLS, 2013, 2013 International Conference on Field-Programmable Technology (FPT).
[28] Bo Chen, et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017, ArXiv.
[29] Yao Chen, et al. Cloud-DNN: An Open Framework for Mapping DNN Models to Cloud FPGAs, 2019, FPGA.
[30] Jing Li, et al. Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network, 2017, FPGA.
[31] Jie Xu, et al. DeepBurning: Automatic generation of FPGA-based learning accelerators for the Neural Network family, 2016, 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC).
[32] Yong Dou, et al. A Uniform Architecture Design for Accelerating 2D and 3D CNNs on FPGAs, 2019, Electronics.
[33] Yu Wang, et al. Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA, 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[34] Jia Wang, et al. DaDianNao: A Machine-Learning Supercomputer, 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.