Versatile Montgomery Multiplier Architectures

Several algorithms for Public Key Cryptography (PKC), such as RSA, Diffie-Hellman, and Elliptic Curve Cryptography, require modular multiplication of very large operands (sizes from 160 to 4096 bits) as their core arithmetic operation. To perform this operation reasonably fast, general purpose processors are not always the best choice. This is why specialized hardware, in the form of cryptographic co-processors, become more attractive. Based upon the analysis of recent publications on hardware design for modular multiplication, this M.S. thesis presents a new architecture that is scalable with respect to word size and pipelining depth. To our knowledge, this is the first time a word based algorithm for Montgomery’s method is realized using high-radix bit-parallel multipliers working with two different types of finite fields (unified architecture for GF (p) and GF (2)). Previous approaches have relied mostly on bit serial multiplication in combination with massive pipelining, or Radix-8 multiplication with the limitation to a single type of finite field. Our approach is centered around the notion that the optimal delay in bit-parallel multipliers grows with logarithmic complexity with respect to the operand size n, O(log3/2 n), while the delay of bit serial implementations grows with linear i complexity O(n). Our design has been implemented in VHDL, simulated and synthesized in 0.5μ CMOS technology. The synthesized net list has been verified in back-annotated timing simulations and analyzed in terms of performance and area consumption.

[1]  Andrew D. Booth,et al.  A SIGNED BINARY MULTIPLICATION TECHNIQUE , 1951 .

[2]  Tolga Acar,et al.  Analyzing and comparing Montgomery multiplication algorithms , 1996, IEEE Micro.

[3]  Johann Großschädl,et al.  A Bit-Serial Unified Multiplier Architecture for Finite Fields GF(p) and GF(2m) , 2001, CHES.

[4]  Colleen Marie O'Rourke Efficient NTRU Implementations , 2002 .

[5]  Alfred Menezes,et al.  Handbook of Applied Cryptography , 2018 .

[6]  I. Koren Computer arithmetic algorithms , 2018 .

[7]  Çetin Kaya Koç,et al.  A Scalable Architecture for Montgomery Multiplication , 1999, CHES.

[8]  ÇETIN K. KOÇ,et al.  Montgomery Multiplication in GF(2k) , 1998, Des. Codes Cryptogr..

[9]  Michael J. Flynn,et al.  Advanced Computer Arithmetic Design , 2001 .

[10]  Johann Großschädl High-Speed RSA Hardware Based on Barret's Modular Reduction Method , 2000, CHES.

[11]  Vojin G. Oklobdzija,et al.  A Method for Speed Optimized Partial Product Reduction and Generation of Fast Parallel Multipliers Using an Algorithmic Approach , 1996, IEEE Trans. Computers.

[12]  Whitfield Diffie,et al.  New Directions in Cryptography , 1976, IEEE Trans. Inf. Theory.

[13]  P. L. Montgomery Modular multiplication without trial division , 1985 .

[14]  Erkay Savas,et al.  A Scalable and Unified Multiplier Architecture for Finite Fields GF(p) and GF(2m) , 2000, CHES.

[15]  Çetin Kaya Koç,et al.  High-Radix Design of a Scalable Modular Multiplier , 2001, CHES.

[16]  Arjen K. Lenstra,et al.  Selecting Cryptographic Key Sizes , 2000, Journal of Cryptology.

[17]  Christopher S. Wallace,et al.  A Suggestion for a Fast Multiplier , 1964, IEEE Trans. Electron. Comput..

[18]  Daniel J. Bernstein,et al.  Circuits for Integer Factorization: A Proposal , 2001 .