Automatic generation of high-performance multipliers for FPGAs with asymmetric multiplier blocks

The introduction of asymmetric embedded multiplier blocks in recent Xilinx FPGAs complicates the design of larger multipliers. Because the two inputs of each embedded multiplier have different bitwidths, the partial product outputs must be shifted by two different factors, which makes even the most straightforward multiplier design less intuitive. In this paper, we present a methodology and a set of equations to automatically generate the Verilog for a multiplier built from asymmetric embedded multiplier cores. The presented technique also rearranges the multiplier block outputs into partial product terms so as to reduce the overall delay of the circuit. Multipliers created with our generator are faster and use fewer DSP blocks than those created using Xilinx Core Generator or by simply using the '*' operator in Verilog. They also use fewer LUTs than those created using the '*' operator. Finally, the presented generator can create multipliers larger than is possible with Core Generator, and is limited only by the number of available embedded multipliers.
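The effect of the two shifting factors can be illustrated with a small software model. The sketch below is not the generator or the equations from the paper; it is a minimal illustration that assumes unsigned operands and a hypothetical block with 24-bit and 17-bit inputs (the function name `tiled_multiply` and the parameters `wa` and `wb` are chosen here for illustration only). Each operand is cut into chunks matching one block input, and the block output for chunk pair (i, j) is shifted by `i*wa + j*wb` before being summed.

```python
def tiled_multiply(a, b, a_bits, b_bits, wa=24, wb=17):
    """Model an unsigned a*b built only from wa x wb block multiplications.

    wa and wb are assumed block input widths, not values taken from the paper.
    Returns the product and the number of block multiplications used.
    """
    mask_a = (1 << wa) - 1
    mask_b = (1 << wb) - 1
    n_a = -(-a_bits // wa)  # ceiling division: chunks needed for operand a
    n_b = -(-b_bits // wb)  # ceiling division: chunks needed for operand b

    total = 0
    blocks = 0
    for i in range(n_a):
        a_chunk = (a >> (i * wa)) & mask_a
        for j in range(n_b):
            b_chunk = (b >> (j * wb)) & mask_b
            # Asymmetric inputs give two independent shift strides:
            # a multiple of wa from operand a, a multiple of wb from operand b.
            shift = i * wa + j * wb
            total += (a_chunk * b_chunk) << shift
            blocks += 1
    return total, blocks


if __name__ == "__main__":
    a = 0xFEDCBA9876543210
    b = 0x0123456789ABCDEF
    product, blocks = tiled_multiply(a, b, 64, 64)
    print(product == a * b, blocks)
```

With a symmetric block, every shift is a multiple of a single width, so block outputs line up in regular columns. With the assumed 24 and 17 bit widths, the shifts are sums of multiples of both, which is why grouping the block outputs into partial product terms takes more care.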
