Low-Power Approximate Unsigned Multipliers With Configurable Error Recovery

Approximate circuits have been considered for applications that can tolerate some loss of accuracy in exchange for improved performance and/or energy efficiency. Multipliers are key arithmetic circuits in many of these applications, including digital signal processing (DSP). In this paper, a novel approximate multiplier with low power consumption and a short critical path is proposed for high-performance DSP applications. This multiplier leverages a newly designed approximate adder that limits its carry propagation to the nearest neighbors for fast partial product accumulation. Different levels of accuracy can be achieved by using either OR gates or the proposed approximate adder in a configurable error recovery circuit. The approximate multipliers using these two error reduction strategies are referred to as AM1 and AM2, respectively. Both AM1 and AM2 have a low mean error distance, i.e., most of the errors are not significant in magnitude. Compared with a Wallace multiplier optimized for speed, an $8\times 8$ AM1 using the four most significant bits for error reduction shows a 60% reduction in delay (when optimized for delay) and a 42% reduction in power dissipation (when optimized for area). In a $16\times 16$ design, half of the least significant partial products are truncated for AM1 and AM2; the resulting multipliers are denoted TAM1 and TAM2, respectively. Compared with the Wallace multiplier, TAM1 and TAM2 save from 50% to 66% in power when optimized for area. Compared with existing approximate multipliers, AM1, AM2, TAM1, and TAM2 show significant advantages in accuracy with a low power-delay product. AM2 is more accurate than AM1, but has a longer delay and higher power consumption. Image processing applications, including image sharpening and smoothing, are considered to show the quality of the approximate multipliers in error-tolerant applications. By utilizing an appropriate error recovery scheme, the proposed approximate multipliers achieve processing accuracy similar to that of exact multipliers, but with significant improvements in power.
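To make the mean error distance metric concrete, the following is a minimal Python sketch that measures it exhaustively for a simple truncation-based multiplier. It is not the AM1/AM2 or TAM1/TAM2 design, whose adder and error recovery circuits are not specified in this abstract. The function names `truncated_multiply` and `mean_error_distance`, the parameter `cut`, and the column-wise truncation rule are all assumptions chosen for illustration. The error distance is taken as the absolute difference between the exact and approximate products, and the mean is computed over all input pairs.

```python
def truncated_multiply(a: int, b: int, width: int = 8, cut: int = 4) -> int:
    """Approximate unsigned multiply (illustrative only, not AM1/AM2).

    Every partial-product bit whose column weight is below 2**cut is
    dropped before accumulation. `cut` is a hypothetical parameter.
    """
    low_mask = (1 << cut) - 1
    total = 0
    for i in range(width):           # bit i of b selects one partial-product row
        if (b >> i) & 1:
            row = a << i             # partial-product row shifted to its weight
            total += row & ~low_mask # clear the bits in the truncated columns
    return total


def mean_error_distance(width: int = 8, cut: int = 4) -> float:
    """Average |exact - approximate| over all pairs of width-bit inputs."""
    n = 1 << width
    total_error = 0
    for a in range(n):
        for b in range(n):
            total_error += abs(a * b - truncated_multiply(a, b, width, cut))
    return total_error / (n * n)


if __name__ == "__main__":
    # Sweep the number of truncated columns to see accuracy degrade.
    for cut in (0, 2, 4, 6):
        print(f"cut={cut}: MED={mean_error_distance(8, cut):.2f}")
```

With `cut=0` nothing is truncated and the result is exact, so the mean error distance is zero. Increasing `cut` removes more low-order partial-product bits and can only increase the error, which mirrors in a simplified way the accuracy-versus-cost trade-off described for the truncated designs.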

[1]  Peter J. Varman,et al.  High performance reliable variable latency carry select addition , 2012, 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[2]  Fabrizio Lombardi,et al.  Approximate Radix-8 Booth Multipliers for Low-Power and High-Performance Operation , 2016, IEEE Transactions on Computers.

[3]  Kaushik Roy,et al.  MACACO: Modeling and analysis of circuits for approximate computing , 2011, 2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).

[4]  Melvin A. Breuer,et al.  Intelligible test techniques to support error-tolerance , 2004, 13th Asian Test Symposium.

[5]  Taejoon Park,et al.  Energy-Efficient Approximate Multiplication for Digital Signal Processing and Classification Applications , 2015, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[6]  Kartikeya Bhardwaj,et al.  Power- and area-efficient Approximate Wallace Tree Multiplier for error-resilient systems , 2014, Fifteenth International Symposium on Quality Electronic Design.

[7]  Fabrizio Lombardi,et al.  New Metrics for the Reliability of Approximate and Probabilistic Adders , 2013, IEEE Transactions on Computers.

[8]  Kiat Seng Yeo,et al.  Low-power high-speed multiplier for error-tolerant application , 2010, 2010 IEEE International Conference of Electron Devices and Solid-State Circuits (EDSSC).

[9]  Shih-Lien Lu Speeding Up Processing with Approximation Circuits , 2004, Computer.

[10]  Mircea Vladutiu,et al.  Computer Arithmetic , 2012, Springer Berlin Heidelberg.

[11]  Kaushik Roy,et al.  IMPACT: IMPrecise adders for low-power approximate computing , 2011, IEEE/ACM International Symposium on Low Power Electronics and Design.

[12]  Gang Wang,et al.  Enhanced low-power high-speed adder for error-tolerant application , 2009, 2010 International SoC Design Conference.

[13]  Mark S. K. Lau,et al.  Energy-aware probabilistic multiplier: design and analysis , 2009, CASES '09.

[14]  Sherief Reda,et al.  ABACUS: A technique for automated behavioral synthesis of approximate computing circuits , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[15]  Sung-Mo Kang,et al.  Electrothermal Analysis of VLSI Systems , 2000 .

[16]  Peng Li,et al.  Array-Based Approximate Arithmetic Computing: A General Model and Applications to Multiplier and Squarer Design , 2015, IEEE Transactions on Circuits and Systems I: Regular Papers.

[17]  Kaushik Roy,et al.  ASLAN: Synthesis of approximate sequential circuits , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[18]  Earl E. Swartzlander,et al.  Analysis of column compression multipliers , 2001, Proceedings 15th IEEE Symposium on Computer Arithmetic. ARITH-15 2001.

[19]  Earl E. Swartzlander,et al.  Computer Arithmetic , 1980 .

[20]  Andrew B. Kahng,et al.  Accuracy-configurable adder for approximate arithmetic designs , 2012, DAC Design Automation Conference 2012.

[21]  Puneet Gupta,et al.  Trading Accuracy for Power in a Multiplier Architecture , 2011, J. Low Power Electron..

[22]  Fabrizio Lombardi,et al.  A low-power, high-performance approximate multiplier with configurable partial error recovery , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[23]  Jie Han,et al.  Approximate computing: An emerging paradigm for energy-efficient design , 2013, 2013 18th IEEE European Test Symposium (ETS).

[24]  David Harris,et al.  CMOS VLSI Design: A Circuits and Systems Perspective , 2004 .

[25]  Arthur Robert Weeks,et al.  The Pocket Handbook of Image Processing Algorithms In C , 1993 .

[26]  Robert C. Wolpert,et al.  A Review of the , 1985 .

[27]  Earl E. Swartzlander,et al.  Parallel reduced area multipliers , 1995, J. VLSI Signal Process..

[28]  Vojin G. Oklobdzija,et al.  A Method for Speed Optimized Partial Product Reduction and Generation of Fast Parallel Multipliers Using an Algorithmic Approach , 1996, IEEE Trans. Computers.

[29]  Paolo Ienne,et al.  Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design , 2008, 2008 Design, Automation and Test in Europe.

[30]  John Lach,et al.  A methodology for energy-quality tradeoff using imprecise hardware , 2012, DAC Design Automation Conference 2012.

[31]  Caro Lucas,et al.  Bio-Inspired Imprecise Computational Blocks for Efficient VLSI Implementation of Soft-Computing Applications , 2010, IEEE Transactions on Circuits and Systems I: Regular Papers.

[32]  Tsin-Yuan Chang,et al.  A High-Accuracy Adaptive Conditional-Probability Estimator for Fixed-Width Booth Multipliers , 2012, IEEE Transactions on Circuits and Systems I: Regular Papers.

[33]  Fabrizio Lombardi,et al.  A Review, Classification, and Comparative Evaluation of Approximate Arithmetic Circuits , 2017, ACM J. Emerg. Technol. Comput. Syst..

[34]  Ku He,et al.  Modeling and synthesis of quality-energy optimal approximate adders , 2012, 2012 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).

[35]  E. J. King,et al.  Data-dependent truncation scheme for parallel multipliers , 1997, Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers (Cat. No.97CB36136).