Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors
Claudionor N. Coelho | Aki Kuusela | Shan Li | Hao Zhuang | Jennifer Ngadiuba | Thea Aarrestad | Vladimir Loncar | Maurizio Pierini | Adrian Alan Pol | Sioni Summers
[1] G. Hua et al. LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks, 2018, ECCV.
[2] Kyuyeon Hwang et al. Fixed-point feedforward deep neural network design using weights +1, 0, and −1, 2014, 2014 IEEE Workshop on Signal Processing Systems (SiPS).
[3] Philip Harris et al. Compressing deep neural networks on FPGAs to binary and ternary precision with hls4ml, 2020, Mach. Learn. Sci. Technol.
[4] Pradeep Dubey et al. Mixed Precision Training of Convolutional Neural Networks using Integer Operations, 2018, ICLR.
[5] Christos-Savvas Bouganis et al. fpgaConvNet: A Framework for Mapping Convolutional Neural Networks on FPGAs, 2016, 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[6] Mark Sandler et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[7] Lingjia Tang et al. The Architectural Implications of Autonomous Driving: Constraints and Acceleration, 2018, ASPLOS.
[8] Daniel Brand et al. Training Deep Neural Networks with 8-bit Floating Point Numbers, 2018, NeurIPS.
[9] Forrest N. Iandola et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size, 2016, ArXiv.
[10] Ke Wang et al. AI Benchmark: Running Deep Neural Networks on Android Smartphones, 2018, ECCV Workshops.
[11] Ahmad Shawahna et al. FPGA-Based Accelerators of Deep Learning Networks for Learning and Classification: A Review, 2019, IEEE Access.
[12] Ameet Talwalkar et al. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization, 2016, J. Mach. Learn. Res.
[13] Pritish Narayanan et al. Deep Learning with Limited Numerical Precision, 2015, ICML.
[14] Eric A. Moreno et al. JEDI-net: a jet identification algorithm based on interaction networks, 2019, The European Physical Journal C.
[15] Shuchang Zhou et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients, 2016, ArXiv.
[16] Xi Chen et al. FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates, 2017, 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[17] Jimmy Ba et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[18] Ji Liu et al. Global Sparse Momentum SGD for Pruning Very Deep Neural Networks, 2019, NeurIPS.
[19] Zhiru Zhang et al. Improving Neural Network Quantization without Retraining using Outlier Channel Splitting, 2019, ICML.
[20] Hon Keung Kwan et al. A design method for multilayer feedforward neural networks for simple hardware implementation, 1993, 1993 IEEE International Symposium on Circuits and Systems.
[21] Kenneth O'Brien et al. FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks, 2018.
[22] Andrew Y. Ng et al. Reading Digits in Natural Images with Unsupervised Feature Learning, 2011.
[23] Xuegong Zhou et al. A high performance FPGA-based accelerator for large-scale convolutional neural networks, 2016, 2016 26th International Conference on Field Programmable Logic and Applications (FPL).
[24] Kurt Keutzer et al. HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[25] Christos-Savvas Bouganis et al. fpgaConvNet: Automated Mapping of Convolutional Neural Networks on FPGAs (Abstract Only), 2017, FPGA.
[26] Markus Nagel et al. Data-Free Quantization Through Weight Equalization and Bias Correction, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[27] Asit K. Mishra et al. From high-level deep neural models to FPGAs, 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[28] Eugenio Culurciello et al. Snowflake: An efficient hardware accelerator for convolutional neural networks, 2017, 2017 IEEE International Symposium on Circuits and Systems (ISCAS).
[29] Bin Liu et al. Ternary Weight Networks, 2016, ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[30] Hao Wu et al. Mixed Precision Training, 2017, ICLR.
[31] Geoffrey E. Hinton et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[32] Christos-Savvas Bouganis et al. Toolflows for Mapping Convolutional Neural Networks on FPGAs, 2018, ACM Comput. Surv.
[33] Philip Harris et al. Fast convolutional neural networks on FPGAs with hls4ml, 2021, Mach. Learn. Sci. Technol.
[34] Marian Verhelst et al. Minimum energy quantized neural networks, 2017, 2017 51st Asilomar Conference on Signals, Systems, and Computers.
[35] Quoc V. Le et al. Searching for MobileNetV3, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[36] Ran El-Yaniv et al. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations, 2016, J. Mach. Learn. Res.
[37] Ian D. Reid et al. Towards Effective Low-Bitwidth Convolutional Neural Networks, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[38] Bo Chen et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017, ArXiv.
[39] Natalia Gimelshein et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[40] Andrew Zisserman et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[41] Lucio Rossi et al. High-Luminosity Large Hadron Collider (HL-LHC): Preliminary Design Report, 2015.
[42] Kurt Keutzer et al. HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks, 2020, NeurIPS.
[43] Jian Sun et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Suyog Gupta et al. To prune, or not to prune: exploring the efficacy of pruning for model compression, 2017, ICLR.
[45] Philip Heng Wai Leong et al. FINN: A Framework for Fast, Scalable Binarized Neural Network Inference, 2016, FPGA.
[46] Kenneth O'Brien et al. FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks, 2018, ACM Trans. Reconfigurable Technol. Syst.
[47] Daniel Soudry et al. Post training 4-bit quantization of convolutional networks for rapid-deployment, 2018, NeurIPS.
[48] Yu Wang et al. A Survey of FPGA-Based Neural Network Inference Accelerator, 2019.
[49] Yuandong Tian et al. Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search, 2018, ArXiv.
[50] Yoshua Bengio et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations, 2015, NIPS.
[51] Alexander Finkelstein et al. Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization, 2019, ICML.
[52] Gianluca Cerminara et al. Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics, 2020, Frontiers in Big Data.
[53] Xiangyu Zhang et al. Channel Pruning for Accelerating Very Deep Neural Networks, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[54] Wayne Luk et al. Hardware Compilation of Deep Neural Networks: An Overview, 2018, 2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP).
[55] Xiangyu Zhang et al. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design, 2018, ECCV.
[56] Song Han et al. Fast inference of deep neural networks in FPGAs for particle physics, 2018, Journal of Instrumentation.
[57] Christos-Savvas Bouganis et al. fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs, 2017, ArXiv.
[58] Yash Akhauri et al. LogicNets: Co-Designed Neural Networks and Circuits for Extreme-Throughput Applications, 2020, 2020 30th International Conference on Field-Programmable Logic and Applications (FPL).
[59] Maxime Pelcat et al. Accelerating CNN inference on FPGAs: A Survey, 2018, ArXiv.
[60] Ali Farhadi et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, 2016, ECCV.
[61] Song Han et al. HAQ: Hardware-Aware Automated Quantization, 2018, ArXiv.
[62] Heiner Litz et al. High Frequency Trading Acceleration Using FPGAs, 2011, 2011 21st International Conference on Field Programmable Logic and Applications.