Toward Full-Stack Acceleration of Deep Convolutional Neural Networks on FPGAs
[1] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[2] Feng Wu,et al. An Energy-Efficient Systolic Pipeline Architecture for Binary Convolutional Neural Network , 2019, 2019 IEEE 13th International Conference on ASIC (ASICON).
[3] Wayne Luk,et al. Towards an Efficient Accelerator for DNN-Based Remote Sensing Image Segmentation on FPGAs , 2019, 2019 29th International Conference on Field Programmable Logic and Applications (FPL).
[4] H. T. Kung,et al. Full-stack Optimization for Accelerating CNNs with FPGA Validation , 2019, ArXiv.
[5] Xiaoqian Zhang,et al. Compute-Efficient Neural-Network Acceleration , 2019, FPGA.
[6] Christos-Savvas Bouganis,et al. fpgaConvNet: Mapping Regular and Irregular Convolutional Neural Networks on FPGAs , 2019, IEEE Transactions on Neural Networks and Learning Systems.
[7] Wayne Luk,et al. Optimizing CNN-based Segmentation with Deeply Customized Convolutional and Deconvolutional Architectures on FPGA , 2018, ACM Trans. Reconfigurable Technol. Syst.
[8] Wayne Luk,et al. Memory-Efficient Architecture for Accelerating Generative Networks on FPGA , 2018, 2018 International Conference on Field-Programmable Technology (FPT).
[9] Hideharu Amano,et al. Performance Estimation for Exascale Reconfigurable Dataflow Platforms , 2018, 2018 International Conference on Field-Programmable Technology (FPT).
[10] Nam Sung Kim,et al. FlexiGAN: An End-to-End Solution for FPGA Acceleration of Generative Adversarial Networks , 2018, 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[11] Wayne Luk,et al. Towards Efficient Convolutional Neural Network for Domain-Specific Applications on FPGA , 2018, 2018 28th International Conference on Field Programmable Logic and Applications (FPL).
[12] Leibo Liu,et al. GNA: Reconfigurable and Efficient Architecture for Generative Network Acceleration , 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[13] Eric S. Chung,et al. A Configurable Cloud-Scale DNN Processor for Real-Time AI , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[14] Ali Farhadi,et al. YOLOv3: An Incremental Improvement , 2018, ArXiv.
[15] Evangeline F. Y. Young,et al. Fast and Accurate Estimation of Quality of Results in High-Level Synthesis with Machine Learning , 2018, 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[16] Bo Chen,et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[17] Farinaz Koushanfar,et al. ReBNet: Residual Binarized Neural Network , 2017, 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[18] Peng Zhang,et al. Automated systolic array architecture synthesis for high throughput CNN inference on FPGAs , 2017, 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC).
[19] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[20] Wayne Luk,et al. Optimizing CNN-Based Object Detection Algorithms on Embedded FPGA Platforms , 2017, ARC.
[21] Shengen Yan,et al. Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs , 2017, 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[22] Yu Cao,et al. Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks , 2017, FPGA.
[23] Viktor Prasanna,et al. Frequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System , 2017, FPGA.
[24] Andrew C. Ling,et al. An OpenCL™ Deep Learning Accelerator on Arria 10 , 2017, FPGA.
[25] Alexis Boukouvalas,et al. GPflow: A Gaussian Process Library using TensorFlow , 2016, J. Mach. Learn. Res.
[26] Manoj Alwani,et al. Fused-layer CNN accelerators , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[27] François Chollet,et al. Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[29] Yi Li,et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.
[30] Bin Liu,et al. Ternary Weight Networks , 2016, ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Sebastian Ramos,et al. The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Francesco Visin,et al. A guide to convolution arithmetic for deep learning , 2016, ArXiv.
[33] Ali Farhadi,et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016, ECCV.
[34] Sergey Ioffe,et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.
[35] Ran El-Yaniv,et al. Binarized Neural Networks , 2016, ArXiv.
[36] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.
[38] Roberto Cipolla,et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[39] Andrew Lavin,et al. Fast Algorithms for Convolutional Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[41] Jason Cong,et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.
[42] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[43] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[45] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[46] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[47] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[48] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[49] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[50] Carl E. Rasmussen,et al. In Advances in Neural Information Processing Systems , 2011.
[51] Yu Wang,et al. Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA , 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[52] Erik Learned-Miller,et al. FDDB: A benchmark for face detection in unconstrained settings , 2010.