Octo: INT8 Training with Loss-aware Compensation and Backward Quantization for Tiny On-device Learning
Jingren Zhou | Qihua Zhou | Zhihao Qu | Boyuan Luo | Jingcai Guo | Zhenda Xu | Song Guo | Jiewei Zhang | Tao Guo
[1] Yurong Chen, et al. Dynamic Network Surgery for Efficient DNNs, 2016, NIPS.
[2] Elad Hoffer, et al. ACIQ: Analytical Clipping for Integer Quantization of neural networks, 2018, ArXiv.
[3] Hiroki Matsutani, et al. A Neural Network-Based On-Device Learning Anomaly Detector for Edge Devices, 2019, IEEE Transactions on Computers.
[4] Swagath Venkataramani, et al. PACT: Parameterized Clipping Activation for Quantized Neural Networks, 2018, ArXiv.
[5] Elad Hoffer, et al. Scalable Methods for 8-bit Training of Neural Networks, 2018, NeurIPS.
[6] Pritish Narayanan, et al. Deep Learning with Limited Numerical Precision, 2015, ICML.
[7] Wencong Xiao, et al. Gandiva: Introspective Cluster Scheduling for Deep Learning, 2018, OSDI.
[8] Zhijian Liu, et al. HAQ: Hardware-Aware Automated Quantization With Mixed Precision, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Yong Zhao, et al. Binarized Neural Networks on the ImageNet Classification Task, 2016.
[10] Joos Vandewalle, et al. A Multilinear Singular Value Decomposition, 2000, SIAM J. Matrix Anal. Appl.
[11] Nicholas D. Lane, et al. DeepEye: Resource Efficient Local Execution of Multiple Deep Vision Models using Wearable Commodity Hardware, 2017, MobiSys.
[12] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[13] Steven Skiena, et al. DeepWalk: online learning of social representations, 2014, KDD.
[14] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[15] Yaoliang Yu, et al. Distributed Machine Learning via Sufficient Factor Broadcasting, 2015, ArXiv.
[16] Jaesik Choi, et al. HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism, 2020, USENIX ATC.
[17] Vikas Chandra, et al. CMSIS-NN: Efficient Neural Network Kernels for Arm Cortex-M CPUs, 2018, ArXiv.
[18] Chuang Gan, et al. Once for All: Train One Network and Specialize it for Efficient Deployment, 2019, ICLR.
[19] Yann LeCun, et al. Regularization of Neural Networks using DropConnect, 2013, ICML.
[20] Ali Farhadi, et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, 2016, ECCV.
[21] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[22] Haichen Shen, et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning, 2018, OSDI.
[23] Wonyong Sung, et al. Structured Pruning of Deep Convolutional Neural Networks, 2015, ACM J. Emerg. Technol. Comput. Syst.
[24] Yibo Zhu, et al. A generic communication scheduler for distributed DNN training acceleration, 2019, SOSP.
[25] Xu Chen, et al. Edge Intelligence: Paving the Last Mile of Artificial Intelligence With Edge Computing, 2019, Proceedings of the IEEE.
[26] Rajendra Akerkar, et al. On-Device Learning Systems for Edge Intelligence: A Software and Hardware Synergy Perspective, 2021, IEEE Internet of Things Journal.
[27] Dumitru Erhan, et al. Going deeper with convolutions, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Wei Liu, et al. SSD: Single Shot MultiBox Detector, 2015, ECCV.
[29] Markus Nagel, et al. Data-Free Quantization Through Weight Equalization and Bias Correction, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[30] Jacek M. Zurada, et al. Feature Selection for Neural Networks Using Group Lasso Regularization, 2020, IEEE Transactions on Knowledge and Data Engineering.
[31] Song Han, et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding, 2015, ICLR.
[32] Blaise Agüera y Arcas, et al. Communication-Efficient Learning of Deep Networks from Decentralized Data, 2016, AISTATS.
[33] Jae-Joon Han, et al. Learning to Quantize Deep Networks by Optimizing Quantization Intervals With Task Loss, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Bo Chen, et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017, ArXiv.
[35] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[36] Yaoliang Yu, et al. Lighter-Communication Distributed Machine Learning via Sufficient Factor Broadcasting, 2016, UAI.
[37] Song Han, et al. DSD: Dense-Sparse-Dense Training for Deep Neural Networks, 2016, ICLR.
[38] Rémi Gribonval, et al. And the Bit Goes Down: Revisiting the Quantization of Neural Networks, 2019, ICLR.
[39] Xianglong Liu, et al. Towards Unified INT8 Training for Convolutional Neural Network, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Song Han, et al. TinyTL: Reduce Memory, Not Parameters for Efficient On-Device Learning, 2020, NeurIPS.
[41] Song Han, et al. Exploring the Regularity of Sparse Structure in Convolutional Neural Networks, 2017, ArXiv.
[42] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[43] Xin Wang, et al. SkipNet: Learning Dynamic Routing in Convolutional Networks, 2017, ECCV.
[44] Song Han, et al. Learning both Weights and Connections for Efficient Neural Network, 2015, NIPS.
[45] Jason Weston, et al. Natural Language Processing (Almost) from Scratch, 2011, J. Mach. Learn. Res.
[46] H. T. Kung, et al. Full-stack optimization for accelerating CNNs using powers-of-two weights with FPGA validation, 2019, ICS.
[47] Pengtao Xie, et al. Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters, 2017, USENIX Annual Technical Conference.
[48] Xin Dong, et al. Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon, 2017, NIPS.
[49] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[50] Wei Wang, et al. Additive Powers-of-Two Quantization: An Efficient Non-uniform Discretization for Neural Networks, 2020, ICLR.
[51] Xiao Zeng, et al. NestDNN: Resource-Aware Multi-Tenant On-Device Deep Learning for Continuous Mobile Vision, 2018, MobiCom.
[52] Bo Chen, et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[53] Max Welling, et al. Relaxed Quantization for Discretized Neural Networks, 2018, ICLR.
[54] Song Han, et al. MCUNet: Tiny Deep Learning on IoT Devices, 2020, NeurIPS.
[55] Yoshua Bengio, et al. Neural Networks with Few Multiplications, 2015, ICLR.