Pipelined Squarer for Unsigned Integers of Up to 16 Bits

A pipelined squarer for unsigned (positive) integers of no more than 16 bits is proposed. The squarer forms a partial product matrix from an input unsigned integer of no more than 16 bits. It comprises a multi-stage multiplier that divides the partial product matrix into a plurality of parts, so that the squaring operation is processed sequentially, starting from the first stage. At each stage, the assigned part of the partial product matrix is reduced to an intermediate value consisting of a sum vector and a carry vector; from the second stage onward, the results of the previous stages are accumulated together with the current part. At the last stage, the sum and carry vectors from the previous stage are added to produce the final square value. The maximum delay of each stage depends on the maximum number of bit lines of the partial product matrix processed at that stage. By analyzing the maximum number of bit lines that the carry-save adder (CSA) tree must accumulate, the CSA tree is limited in size, which yields an optimized squarer.
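The staged data flow described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the design claimed in the abstract: the function names, the default of three stages, and the equal-sized split of partial product rows across stages are all hypothetical, since the abstract does not say how the matrix is divided. The sketch also does not fold the symmetric partial product matrix, as a hardware squarer typically would.

```python
def partial_product_rows(x, width=16):
    """Rows of the partial product matrix for x * x.

    Row i is x shifted left by i when bit i of x is set, else 0.
    """
    return [(x << i) if (x >> i) & 1 else 0 for i in range(width)]


def csa(a, b, c):
    """3:2 carry-save adder: a + b + c == sum_vec + carry_vec."""
    sum_vec = a ^ b ^ c
    carry_vec = ((a & b) | (a & c) | (b & c)) << 1
    return sum_vec, carry_vec


def reduce_to_sum_carry(rows):
    """Compress any number of rows into one (sum, carry) pair using CSAs."""
    rows = list(rows)
    while len(rows) > 2:
        a, b, c = rows.pop(), rows.pop(), rows.pop()
        rows.extend(csa(a, b, c))
    while len(rows) < 2:
        rows.append(0)
    return rows[0], rows[1]


def pipelined_square(x, width=16, stages=3):
    """Behavioral model of a multi-stage squarer (assumed stage split).

    Stages 1 .. stages-1 each compress one slice of the partial product
    matrix together with the (sum, carry) pair from the previous stage.
    The last stage performs the final carry-propagate addition.
    """
    if stages < 2:
        raise ValueError("need at least one CSA stage and one final-add stage")
    if not 0 <= x < (1 << width):
        raise ValueError("input does not fit in the given width")

    rows = partial_product_rows(x, width)
    csa_stages = stages - 1
    chunk = -(-width // csa_stages)  # ceiling division: rows per CSA stage

    sum_vec, carry_vec = 0, 0  # these model the pipeline registers
    for k in range(csa_stages):
        part = rows[k * chunk:(k + 1) * chunk]
        sum_vec, carry_vec = reduce_to_sum_carry(part + [sum_vec, carry_vec])

    return sum_vec + carry_vec  # last stage: final adder


if __name__ == "__main__":
    for value in (0, 1, 255, 12345, 65535):
        assert pipelined_square(value) == value * value
```

In this model, the number of rows fed to `reduce_to_sum_carry` at each stage (the slice plus the two accumulated vectors) plays the role of the maximum number of bit lines that determines the stage delay and the size of the CSA tree.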
