Scaling the Cascades: Interconnect-Aware FPGA Implementation of Machine Learning Problems
暂无分享,去创建一个
Nachiket Kapre | Tushar Krishna | Ananda Samajdar | Tushar Garg | N. Kapre | T. Krishna | A. Samajdar | Tushar Garg | Nachiket Kapre
[1] Xiaoqian Zhang,et al. Compute-Efficient Neural-Network Acceleration , 2019, FPGA.
[2] Vaughn Betz,et al. Embracing Diversity: Enhanced DSP Blocks for Low-Precision Deep Learning on FPGAs , 2018, 2018 28th International Conference on Field Programmable Logic and Applications (FPL).
[3] Eric S. Chung,et al. A Configurable Cloud-Scale DNN Processor for Real-Time AI , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[4] Soheil Ghiasi,et al. PLACID: A Platform for FPGA-Based Accelerator Creation for DCNNs , 2017, ACM Trans. Multim. Comput. Commun. Appl..
[5] Christos-Savvas Bouganis,et al. Latency-driven design for FPGA-based convolutional neural networks , 2017, 2017 27th International Conference on Field Programmable Logic and Applications (FPL).
[6] Yu Cao,et al. An automatic RTL compiler for high-throughput FPGA implementation of diverse deep convolutional neural networks , 2017, 2017 27th International Conference on Field Programmable Logic and Applications (FPL).
[7] Inkeun Cho,et al. A high-throughput reconfigurable processing array for neural networks , 2017, 2017 27th International Conference on Field Programmable Logic and Applications (FPL).
[8] M. Pelcat,et al. Tactics to Directly Map CNN Graphs on Embedded FPGAs , 2017, IEEE Embedded Systems Letters.
[9] Shijie Li,et al. Throughput-Optimized FPGA Accelerator for Deep Convolutional Neural Networks , 2017, ACM Trans. Reconfigurable Technol. Syst..
[10] Peng Zhang,et al. Automated systolic array architecture synthesis for high throughput CNN inference on FPGAs , 2017, 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC).
[11] Nachiket Kapre,et al. Implementing FPGA Overlay NoCs Using the Xilinx UltraScale Memory Cascades , 2017, 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[12] Jing Li,et al. Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network , 2017, FPGA.
[13] Yu Cao,et al. Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks , 2017, FPGA.
[14] Eriko Nurvitadhi,et al. Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks? , 2017, FPGA.
[15] Jason Cong,et al. Caffeine: Towards uniformed representation and acceleration for deep convolutional neural networks , 2016, 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[16] Asit K. Mishra,et al. From high-level deep neural models to FPGAs , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[17] Manoj Alwani,et al. Fused-layer CNN accelerators , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[18] Jason Cong,et al. Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster , 2016, ISLPED.
[19] Yu Cao,et al. Scalable and modularized RTL compilation of Convolutional Neural Networks onto FPGA , 2016, 2016 26th International Conference on Field Programmable Logic and Applications (FPL).
[20] Xuegong Zhou,et al. A high performance FPGA-based accelerator for large-scale convolutional neural networks , 2016, 2016 26th International Conference on Field Programmable Logic and Applications (FPL).
[21] Michael Ferdman,et al. Overcoming resource underutilization in spatial CNN accelerators , 2016, 2016 26th International Conference on Field Programmable Logic and Applications (FPL).
[22] Vivienne Sze,et al. Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[23] Bin Liu,et al. Ternary Weight Networks , 2016, ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Christos-Savvas Bouganis,et al. fpgaConvNet: A Framework for Mapping Convolutional Neural Networks on FPGAs , 2016, 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[25] Kiyoung Choi,et al. Efficient FPGA acceleration of Convolutional Neural Networks using logical-3D compute array , 2016, 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[26] Yu Wang,et al. Going Deeper with Embedded FPGA Platform for Convolutional Neural Network , 2016, FPGA.
[27] Yu Cao,et al. Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks , 2016, FPGA.
[28] Jason Cong,et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.
[29] Nick Mehta. Pushing Performance and Integration with the UltraScale + Portfolio , 2015 .