FPDeep: Acceleration and Load Balancing of CNN Training on FPGA Clusters
Chen Yang | Rui Xu | Martin C. Herbordt | Tianqi Wang | Ahmed Sanaullah | Rushi Patel | Tong Geng
[1] Guangwen Yang, et al. F-CNN: An FPGA-based framework for training Convolutional Neural Networks, 2016, 2016 IEEE 27th International Conference on Application-specific Systems, Architectures and Processors (ASAP).
[2] Nachiket Kapre, et al. CaffePresso: Accelerating Convolutional Networks on Embedded SoCs, 2017, ACM Trans. Embed. Comput. Syst.
[3] Eriko Nurvitadhi, et al. High performance binary neural networks on the Xeon+FPGA™ platform, 2017, 2017 27th International Conference on Field Programmable Logic and Applications (FPL).
[4] Pritish Narayanan, et al. Deep Learning with Limited Numerical Precision, 2015, ICML.
[5] Jason Cong, et al. Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster, 2016, ISLPED.
[6] Xi Chen, et al. FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates, 2017, 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[7] Ruo Long Lian. A framework for FPGA-based acceleration of neural network inference with limited numerical precision via high-level synthesis with streaming functionality, 2016.