CNN Inference Using a Preprocessing Precision Controller and Approximate Multipliers With Various Precisions
[1] Lin Yang, et al. Ultra Power-Efficient CNN Domain Specific Accelerator with 9.3TOPS/Watt for Mobile and Embedded Applications, 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[2] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[3] Luca Benini, et al. Origami: A Convolutional Network Accelerator, 2015, ACM Great Lakes Symposium on VLSI.
[4] Dimitrios Soudris, et al. Design-Efficient Approximate Multiplication Circuits Through Partial Product Perforation, 2016, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[5] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[6] Li Fei-Fei, et al. ImageNet: A Large-Scale Hierarchical Image Database, 2009, CVPR.
[7] Jingyu Wang, et al. STICKER: An Energy-Efficient Multi-Sparsity Compatible Accelerator for Convolutional Neural Networks in 65-nm CMOS, 2020, IEEE Journal of Solid-State Circuits.
[8] Kiamal Z. Pekmestzi, et al. Cooperative Arithmetic-Aware Approximation Techniques for Energy-Efficient Multipliers, 2019, 56th ACM/IEEE Design Automation Conference (DAC).
[9] Bruce F. Cockburn, et al. Improving the Accuracy and Hardware Efficiency of Neural Networks Using Approximate Multipliers, 2020, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[10] Qiang Xu, et al. Approximate Computing: A Survey, 2016, IEEE Design & Test.
[11] Mehdi Kamal, et al. RoBA Multiplier: A Rounding-Based Approximate Multiplier for High-Speed yet Energy-Efficient Digital Signal Processing, 2017, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[12] Xueliang Zhang, et al. Design of 16-bit Fixed-Point CNN Coprocessor Based on FPGA, 2018, IEEE 23rd International Conference on Digital Signal Processing (DSP).
[13] Xuegong Zhou, et al. A High Performance FPGA-Based Accelerator for Large-Scale Convolutional Neural Networks, 2016, 26th International Conference on Field Programmable Logic and Applications (FPL).
[14] Shihui Yin, et al. A 2.6 TOPS/W 16-Bit Fixed-Point Convolutional Neural Network Learning Processor in 65-nm CMOS, 2020, IEEE Solid-State Circuits Letters.
[15] Abien Fred Agarap. Deep Learning Using Rectified Linear Units (ReLU), 2018, arXiv.
[16] Yu Wang, et al. Towards Real-Time Object Detection on Embedded Systems, 2018, IEEE Transactions on Emerging Topics in Computing.
[17] Iraklis Anagnostopoulos, et al. Weight-Oriented Approximation for Energy-Efficient Neural Network Inference Accelerators, 2020, IEEE Transactions on Circuits and Systems I: Regular Papers.
[18] Fabrizio Lombardi, et al. Design of Approximate Radix-4 Booth Multipliers for Error-Tolerant Computing, 2017, IEEE Transactions on Computers.
[19] David A. Patterson, et al. In-Datacenter Performance Analysis of a Tensor Processing Unit, 2017, ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[20] Mehdi Kamal, et al. TOSAM: An Energy-Efficient Truncation- and Rounding-Based Scalable Approximate Multiplier, 2019, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[21] Jie Han, et al. Approximate Computing: An Emerging Paradigm for Energy-Efficient Design, 2013, 18th IEEE European Test Symposium (ETS).
[22] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Kamal El-Sankary, et al. Impact of Approximate Multipliers on VGG Deep Learning Network, 2018, IEEE Access.
[24] Zhenyu Liu, et al. High-Performance FPGA-Based CNN Accelerator With Block-Floating-Point Arithmetic, 2019, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[25] Taejoon Park, et al. Energy-Efficient Approximate Multiplication for Digital Signal Processing and Classification Applications, 2015, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[26] Kaushik Roy, et al. Design of Power-Efficient Approximate Multipliers for Approximate Artificial Neural Networks, 2016, IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[27] Pritish Narayanan, et al. Deep Learning with Limited Numerical Precision, 2015, ICML.
[28] Seok-Bum Ko, et al. Design of Power and Area Efficient Approximate Multipliers, 2017, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[29] Sparsh Mittal, et al. A Survey of Techniques for Approximate Computing, 2016, ACM Comput. Surv.
[30] Kaushik Roy, et al. Analysis and Characterization of Inherent Application Resilience for Approximate Computing, 2013, 50th ACM/EDAC/IEEE Design Automation Conference (DAC).
[31] Peng Zhang, et al. Automated Systolic Array Architecture Synthesis for High Throughput CNN Inference on FPGAs, 2017, 54th ACM/EDAC/IEEE Design Automation Conference (DAC).
[32] Jason Gu, et al. Deep Learning Training with Simulated Approximate Multipliers, 2019, IEEE International Conference on Robotics and Biomimetics (ROBIO).
[33] François Chollet, et al. Xception: Deep Learning with Depthwise Separable Convolutions, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Sherief Reda, et al. DRUM: A Dynamic Range Unbiased Multiplier for Approximate Applications, 2015, IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[35] Sachin S. Talathi, et al. Fixed Point Quantization of Deep Convolutional Networks, 2015, ICML.
[36] Benjamin Recht, et al. Do ImageNet Classifiers Generalize to ImageNet?, 2019, ICML.
[37] Taejoon Park, et al. SiMul: An Algorithm-Driven Approximate Multiplier Design for Machine Learning, 2018, IEEE Micro.