Evaluation of Large Integer Multiplication Methods on Hardware

Multipliers with large operand bit lengths have a major impact on the performance of many applications, such as cryptography, digital signal processing (DSP) and image processing. Novel, optimised designs for large integer multiplication are needed because established approaches, such as schoolbook multiplication, may not be feasible at such large parameter sizes. Cryptographic applications, such as lattice-based and fully homomorphic encryption (FHE) schemes, can require parameter bit lengths of up to millions of bits. This paper presents a comparison of hardware architectures for large integer multiplication. Several multiplication methods, and combinations thereof, are analysed for their suitability in hardware designs targeting the FPGA platform. In particular, the first hardware architecture combining Karatsuba and Comba multiplication is proposed. Moreover, a hardware complexity analysis is conducted to give results that are independent of any particular FPGA platform. It is shown that hardware designs of combination multipliers can offer lower latency than individual multiplier designs, at the cost of additional hardware resource usage. Indeed, the proposed Karatsuba-Comba combination multiplier offers the lowest latency for integers greater than 512 bits. For large multiplicands, greater than 16,384 bits, the hardware complexity analysis indicates that the NTT-Karatsuba-Schoolbook combination is the most suitable.
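To make the idea of a combination multiplier concrete, the following is a minimal software sketch of how Karatsuba splitting can be layered on top of a Comba (product-scanning) base multiplier. It is an illustration only and does not model the hardware architecture proposed in the paper. The names `comba_mul` and `karatsuba_comba_mul`, the 16-bit word size and the 64-bit switch-over threshold are all assumptions chosen for the example.

```python
# Illustrative software sketch; not the paper's hardware design.
WORD_BITS = 16         # assumed word size of the base multiplier
COMBA_LIMIT_BITS = 64  # assumed operand size below which Comba is used


def to_words(x, word_bits=WORD_BITS):
    """Split a non-negative integer into little-endian words."""
    mask = (1 << word_bits) - 1
    words = []
    while x:
        words.append(x & mask)
        x >>= word_bits
    return words or [0]


def comba_mul(a, b, word_bits=WORD_BITS):
    """Comba (product-scanning) multiplication of non-negative integers.

    Partial products are accumulated column by column, so each result
    word is written exactly once.
    """
    mask = (1 << word_bits) - 1
    aw, bw = to_words(a, word_bits), to_words(b, word_bits)
    n, m = len(aw), len(bw)
    result = 0
    acc = 0  # running column accumulator, carries into the next column
    for k in range(n + m - 1):
        for i in range(max(0, k - m + 1), min(k, n - 1) + 1):
            acc += aw[i] * bw[k - i]
        result |= (acc & mask) << (k * word_bits)
        acc >>= word_bits
    # Whatever remains in the accumulator is the most significant word.
    result |= acc << ((n + m - 1) * word_bits)
    return result


def karatsuba_comba_mul(a, b, limit_bits=COMBA_LIMIT_BITS):
    """Karatsuba recursion that falls back to Comba for small operands."""
    n = max(a.bit_length(), b.bit_length())
    if n <= limit_bits:
        return comba_mul(a, b)
    half = n // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    # Three half-size multiplications instead of four.
    low = karatsuba_comba_mul(a_lo, b_lo, limit_bits)
    high = karatsuba_comba_mul(a_hi, b_hi, limit_bits)
    mid = karatsuba_comba_mul(a_lo + a_hi, b_lo + b_hi, limit_bits) - low - high
    return (high << (2 * half)) + (mid << half) + low


if __name__ == "__main__":
    x = (1 << 1000) - 12345
    y = (1 << 777) + 98765
    print(karatsuba_comba_mul(x, y) == x * y)
```

In this sketch the threshold controls the trade-off between recursion depth and the size of the base multiplier; the value at which a combination design becomes preferable in hardware is a result of the paper's evaluation and is not derived from this code.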
