Low-Power Unsigned Divider and Square Root Circuit Designs Using Adaptive Approximation

In this paper, an adaptive approximation approach is proposed for the design of a divider and a square root (SQR) circuit. The division/SQR is computed using a reduced-width divider/SQR circuit and a shifter, with insignificant input bits pruned adaptively. Specifically, for a $2n/n$ division, $2k$ and $k$ ($k<n$) consecutive bits are selected starting from the most significant '1' in the dividend and divisor, respectively. Redundant least significant bits (LSBs) are truncated; if the number of bits remaining after pruning is smaller than the number of bits to be kept, '0's are appended to the LSBs of the inputs. To avoid overflow, a $2(k+1)/(k+1)$ divider is used to compute the $2k/k$ division. Finally, an error correction circuit based on OR gates is proposed to recover the error caused by the shifter. For a $2n$-bit approximate SQR circuit, similar pruning schemes are used to obtain a $2k$-bit radicand. A $2k$-bit SQR circuit and a shifter are then utilized to compute the SQR. This adaptive operation leads to very small maximum error distances for the approximate divider and SQR circuits, as shown by a theoretical error analysis. The proposed 16/8 approximate divider using an 8/4 exact array divider is $2.5\times$ as fast and consumes only 34.42 percent of the power of the accurate design. Compared to the accurate 16-bit array SQR circuit, the approximate design with a 6-bit radicand is $3.9\times$ as fast and consumes 20.66 percent of the power. The approximate SQR circuit using a 6-bit lookup table-based SQR circuit consumes 7.15 percent of the power of its corresponding accurate design. The proposed designs outperform other approximate designs in image processing applications including change detection (for the divider), envelope detection (for the SQR circuit) and image reconstruction (for both designs).
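The pruning-and-shifting idea described above can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the circuit from the paper: the function names `prune`, `approx_div` and `approx_sqrt` are hypothetical, Python's unbounded integers stand in for the $2(k+1)/(k+1)$ divider that guards against overflow, the OR-gate error correction is omitted, and the SQR model assumes the truncation amount is rounded up to an even number so that the root can be restored by an integer shift.

```python
import math


def prune(x, keep):
    """Keep `keep` consecutive bits starting at the most significant '1' of x.

    Returns (pruned, shift) such that x is approximately pruned * 2**shift.
    A negative shift means zeros were appended to the LSBs.
    """
    if x == 0:
        return 0, 0
    shift = x.bit_length() - keep
    if shift >= 0:
        return x >> shift, shift      # truncate redundant LSBs
    return x << -shift, shift         # append '0's to the LSBs


def approx_div(dividend, divisor, k):
    """Software model of an approximate 2n/n division using a 2k/k division."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    a, shift_a = prune(dividend, 2 * k)
    b, shift_b = prune(divisor, k)
    q = a // b                        # reduced-width exact division
    s = shift_a - shift_b             # net shift restoring the magnitude
    return q << s if s >= 0 else q >> -s


def approx_sqrt(radicand, k):
    """Software model of an approximate SQR using a 2k-bit radicand."""
    shift = radicand.bit_length() - 2 * k
    if shift <= 0:
        return math.isqrt(radicand)   # already fits in 2k bits (simplification)
    shift += shift & 1                # assumption: force an even shift
    return math.isqrt(radicand >> shift) << (shift // 2)


print(approx_div(50000, 200, 4), 50000 // 200)
print(approx_sqrt(50000, 4), math.isqrt(50000))
```

With `k = 4`, the model divides 195 by 12 instead of 50000 by 200 and shifts the quotient left by four bits, which shows how the reduced-width operation trades a small error for a much narrower datapath.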
