Approximate Computing for ML: State-of-the-art, Challenges and Visions

In this paper, we present our state-of-the-art approximation techniques, which cover the main pillars of approximate computing research. Our analysis considers both static and reconfigurable approximation techniques, as well as operation-specific approximate components (e.g., multipliers) and generalized approximate high-level synthesis approaches. As our application target, we discuss the improvements that such techniques bring to machine learning and neural networks. In addition to the conventionally analyzed performance and energy gains, we also evaluate the improvements that approximate computing brings to the operating temperature.
