Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks
[1] Francesco Conti, et al. Free Bits: Latency Optimization of Mixed-Precision Quantized Neural Networks on the Edge, 2023, 2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS).
[2] E. Macii, et al. Precision-aware Latency and Energy Balancing on Multi-Accelerator Platforms for DNN Inference, 2023, ArXiv.
[3] Francesco Conti, et al. DARKSIDE: A Heterogeneous RISC-V Compute Cluster for Extreme-Edge On-Chip DNN Inference and Training, 2023, IEEE Open Journal of the Solid-State Circuits Society.
[4] H. Corporaal, et al. BrainTTA: A 35 fJ/op Compiler Programmable Mixed-Precision Transport-Triggered NN SoC, 2022, ArXiv.
[5] D. Golovin, et al. Open Source Vizier: Distributed Infrastructure and API for Reliable and Flexible Blackbox Optimization, 2022, AutoML.
[6] Yutong Lu, et al. moTuner: a compiler-based auto-tuning approach for mixed-precision operators, 2022, CF.
[7] Yichi Zhang, et al. PokeBNN: A Binary Pursuit of Lightweight Accuracy, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Gert Cauwenberghs, et al. Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory, 2021, ArXiv.
[9] Alexander Kolesnikov, et al. Scaling Vision Transformers, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Cho-Jui Hsieh, et al. When Vision Transformers Outperform ResNets without Pretraining or Strong Data Augmentations, 2021, ICLR.
[11] Junghyup Lee, et al. Network Quantization with Element-wise Gradient Scaling, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Berkin Akin, et al. An Evaluation of Edge TPU Accelerators for Convolutional Neural Networks, 2021, 2022 IEEE International Symposium on Workload Characterization (IISWC).
[13] William J. Dally, et al. VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference, 2021, MLSys.
[14] Kurt Keutzer, et al. HAWQV3: Dyadic Neural Network Quantization, 2020, ICML.
[15] Hieu Duy Nguyen, et al. Quantization Aware Training with Absolute-Cosine Regularization for Automatic Speech Recognition, 2020, INTERSPEECH.
[16] Eunhyeok Park, et al. PROFIT: A Novel Training Method for sub-4-bit MobileNet Models, 2020, ECCV.
[17] Nojun Kwak, et al. Position-based Scaled Gradient for Model Quantization and Pruning, 2020, NeurIPS.
[18] Jinwon Lee, et al. LSQ+: Improving low-bit quantization through learnable offsets and better initialization, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[19] Michael W. Mahoney, et al. PyHessian: Neural Networks Through the Lens of the Hessian, 2019, 2020 IEEE International Conference on Big Data (Big Data).
[20] Aaron C. Courville, et al. What Do Compressed Deep Neural Networks Forget?, 2019, ArXiv.
[21] Michael W. Mahoney, et al. HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks, 2019, NeurIPS.
[22] Vivienne Sze, et al. Accelergy: An Architecture-Level Energy Estimation Methodology for Accelerator Designs, 2019, 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[23] M. Shoeybi, et al. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism, 2019, ArXiv.
[24] Xianglong Liu, et al. Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[25] Yiying Zhang, et al. "Learned", 2019, ACM SIGOPS Operating Systems Review.
[26] T. Kemp, et al. Mixed Precision DNNs: All you need is a good parametrization, 2019, ICLR.
[27] Quoc V. Le, et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, 2019, ICML.
[28] Steven K. Esser, et al. Learned Step Size Quantization, 2019, ICLR.
[29] Zhijian Liu, et al. HAQ: Hardware-Aware Automated Quantization With Mixed Precision, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Hadi Esmaeilzadeh, et al. ReLeQ: A Reinforcement Learning Approach for Deep Quantization of Neural Networks, 2018, ArXiv.
[31] Daniel Soudry, et al. Post training 4-bit quantization of convolutional networks for rapid-deployment, 2018, NeurIPS.
[32] Jae-Joon Han, et al. Learning to Quantize Deep Networks by Optimizing Quantization Intervals With Task Loss, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Vivienne Sze, et al. Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices, 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[34] Kurt Keutzer, et al. SqueezeNext: Hardware-Aware Neural Network Design, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[35] Swagath Venkataramani, et al. PACT: Parameterized Clipping Activation for Quantized Neural Networks, 2018, ArXiv.
[36] Mark Sandler, et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[37] Bo Chen, et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[38] D. Sculley, et al. Google Vizier: A Service for Black-Box Optimization, 2017, KDD.
[39] David A. Patterson, et al. In-datacenter performance analysis of a tensor processing unit, 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[40] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[42] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, ArXiv.
[43] Stephen P. Boyd, et al. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, 2011, Found. Trends Mach. Learn.
[44] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[45] Marian Verhelst, et al. Survey and Benchmarking of Precision-Scalable MAC Arrays for Embedded DNN Processing, 2021, ArXiv.