[1] Asit K. Mishra, et al. Low Precision RNNs: Quantizing RNNs Without Losing Accuracy, 2017, ArXiv.
[2] Yvon Savaria, et al. Bit-Slicing FPGA Accelerator for Quantized Neural Networks, 2019, 2019 IEEE International Symposium on Circuits and Systems (ISCAS).
[3] Swagath Venkataramani, et al. Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks, 2019, NeurIPS.
[4] Babak Falsafi, et al. Training DNNs with Hybrid Block Floating Point, 2018, NeurIPS.
[5] Patrick Judd, et al. Stripes: Bit-serial deep neural network computing, 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[6] H. T. Kung. Why systolic architectures?, 1982, Computer.
[7] Brian Chmiel, et al. Neural gradients are near-lognormal: improved quantized and sparse training, 2020, ICLR.
[8] Yoshua Bengio, et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations, 2015, NIPS.
[9] Alessandro Forin, et al. Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point, 2020, NeurIPS.
[10] Swagath Venkataramani, et al. A 3.0 TFLOPS 0.62V Scalable Processor Core for High Compute Utilization AI Training and Inference, 2020, 2020 IEEE Symposium on VLSI Circuits.
[11] Pritish Narayanan, et al. Deep Learning with Limited Numerical Precision, 2015, ICML.
[12] Joel Silberman, et al. A Scalable Multi-TeraOPS Deep Learning Processor Core for AI Training and Inference, 2018, 2018 IEEE Symposium on VLSI Circuits.
[13] Song Han, et al. Trained Ternary Quantization, 2016, ICLR.
[14] David Patterson, et al. A domain-specific supercomputer for training deep neural networks, 2020, Commun. ACM.
[15] Hao Wu, et al. Mixed Precision Training, 2017, ICLR.
[16] Eunhyeok Park, et al. Weighted-Entropy-Based Quantization for Deep Neural Networks, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Ali Farhadi, et al. YOLO9000: Better, Faster, Stronger, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Yoshua Bengio, et al. Training deep neural networks with low precision multiplications, 2014.
[19] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[20] Dipankar Das, et al. SIGMA: A Sparse and Irregular GEMM Accelerator with Flexible Interconnects for DNN Training, 2020, 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[21] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[22] Michael J. Schulte, et al. Design alternatives for barrel shifters, 2002, SPIE Optics + Photonics.
[23] Ran El-Yaniv, et al. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations, 2016, J. Mach. Learn. Res.
[24] Elad Hoffer, et al. Scalable Methods for 8-bit Training of Neural Networks, 2018, NeurIPS.
[25] Eric S. Chung, et al. A Configurable Cloud-Scale DNN Processor for Real-Time AI, 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[26] Xiaomei Yang. Rounding Errors in Algebraic Processes, 1964, Nature.
[27] Jianguo Zhang, et al. The PASCAL Visual Object Classes Challenge, 2006.
[28] Kunle Olukotun, et al. DAWNBench: An End-to-End Deep Learning Benchmark and Competition, 2017.
[29] Seungkyu Choi, et al. An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for In Situ Personalization on Smart Devices, 2020, IEEE Journal of Solid-State Circuits.
[30] Quoc V. Le, et al. Adding Gradient Noise Improves Learning for Very Deep Networks, 2015, ArXiv.
[31] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[32] Guy Lemieux, et al. Procrustes: a Dataflow and Accelerator for Sparse Deep Neural Network Training, 2020, 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[33] Tao Li, et al. Eager Pruning: Algorithm and Architecture Support for Fast Training of Deep Neural Networks, 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).
[34] H. T. Kung, et al. Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization, 2018, ASPLOS.
[35] Bo Chen, et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[36] Andreas Moshovos, et al. TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training, 2020, 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[37] Xin Wang, et al. Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks, 2017, NIPS.