LP-BNN: Ultra-low-Latency BNN Inference with Layer Parallelism
暂无分享,去创建一个
Tianqi Wang | Ang Li | Tong Geng | Shuaiwen Leon Song | Martin Herbordt | Chen Yang | Chunshu Wu | Tong Geng | Ang Li | Tianqi Wang | Chunshu Wu | M. Herbordt | Shuaiwen Song | Chen Yang | S. Song
[1] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[2] Eugenio Culurciello,et al. An Analysis of Deep Neural Network Models for Practical Applications , 2016, ArXiv.
[3] Akash Kumar,et al. Dataflow-Based Mapping of Spiking Neural Networks on Neuromorphic Hardware , 2018, ACM Great Lakes Symposium on VLSI.
[4] Martin C. Herbordt,et al. O3BNN: an out-of-order architecture for high-performance binarized neural network inference with fine-grained pruning , 2019, ICS.
[5] Yoshua Bengio,et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations , 2015, NIPS.
[6] Gang Hua,et al. How to Train a Compact Binary Neural Network with High Accuracy? , 2017, AAAI.
[7] Shuchang Zhou,et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients , 2016, ArXiv.
[8] Chen Yang,et al. Novo-G#: Large-scale reconfigurable computing with direct and programmable interconnects , 2016, 2016 IEEE High Performance Extreme Computing Conference (HPEC).
[9] Chen Yang,et al. FPDeep: Acceleration and Load Balancing of CNN Training on FPGA Clusters , 2018, 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[10] Luca Benini,et al. YodaNN: An Architecture for Ultralow Power Binary-Weight CNN Acceleration , 2016, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[11] Jiangming Jin,et al. BitFlow: Exploiting Vector Parallelism for Binary Neural Networks on CPU , 2018, 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[12] Igor Carron,et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016 .
[13] Philip Heng Wai Leong,et al. FINN: A Framework for Fast, Scalable Binarized Neural Network Inference , 2016, FPGA.
[14] Albert Cohen,et al. Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions , 2018, ArXiv.
[15] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[16] Martin C. Herbordt,et al. A Framework for Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters with Work and Weight Load Balancing , 2018, 2018 28th International Conference on Field Programmable Logic and Applications (FPL).
[17] Wei Pan,et al. Towards Accurate Binary Convolutional Neural Network , 2017, NIPS.
[18] Rajesh Gupta,et al. Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs , 2017, FPGA.
[19] Ran El-Yaniv,et al. Binarized Neural Networks , 2016, NIPS.
[20] Eriko Nurvitadhi,et al. Accelerating Binarized Neural Networks: Comparison of FPGA, CPU, GPU, and ASIC , 2016, 2016 International Conference on Field-Programmable Technology (FPT).
[21] Wayne Luk,et al. FP-BNN: Binarized neural network on FPGA , 2018, Neurocomputing.
[22] Martin C. Herbordt,et al. A Scalable Framework for Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters with Weight and Workload Balancing , 2019, ArXiv.