Efficient Modular Multiplication

This chapter is concerned with one of the fundamental building blocks used in modern public-key cryptography: modular multiplication. Speed-ups applied to the modular multiplication algorithm or its implementation translate directly into faster modular exponentiation for RSA or a faster realization of the group law in elliptic curve cryptography. This chapter outlines the most commonly used modular multiplication method for generic moduli, Montgomery multiplication, as well as different techniques for "special" moduli of a particular shape. Moreover, we study approaches which might produce errors with a very small probability. Such faster "sloppy reduction" techniques are especially beneficial in cryptanalytic settings. We look at this from both a historical and an applied implementation perspective. The best approach to implementing modular multiplication on a modern 64-bit architecture with advanced single-instruction, multiple-data (SIMD) instruction set extensions is, for example, quite different from the best approach on resource-constrained embedded devices. Throughout this chapter we focus on the cryptographic setting unless we specifically discuss an algorithm for cryptanalysis. Contrary to many mathematical software applications, a cryptographic implementation (and hence also its modular multiplication) should avoid secret-data-dependent branches and secretly indexed memory accesses, so that its running time does not depend on secret data. Such constant-time implementations are one of the basic countermeasures against timing attacks: advanced techniques which use information about the running time of the target algorithm to extract the private key. Such attacks are part of a larger family of attacks known as side-channel attacks. Throughout this chapter we represent a wn-bit non-negative integer X in the so-called radix-2^w representation,
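As a rough illustration of the radix-2^w representation and of Montgomery multiplication for a generic odd modulus, the following Python sketch may help. It is not taken from the chapter: the function names `to_limbs`, `from_limbs` and `montgomery_multiply` are hypothetical, the word-by-word (interleaved) variant shown is only one of several possible formulations, and the final comparison is an ordinary branch, so this sketch is not constant time.

```python
def to_limbs(x, w, n):
    """Split a non-negative integer x < 2**(w*n) into n limbs of w bits,
    least significant limb first (radix-2^w representation)."""
    if x < 0 or x >> (w * n):
        raise ValueError("x does not fit in n limbs of w bits")
    mask = (1 << w) - 1
    return [(x >> (w * i)) & mask for i in range(n)]


def from_limbs(limbs, w):
    """Recombine limbs (least significant first) into an integer."""
    return sum(limb << (w * i) for i, limb in enumerate(limbs))


def montgomery_multiply(a, b, m, w, n):
    """Return a * b * R^(-1) mod m with R = 2**(w*n).

    Assumes m is odd, m < R, and 0 <= a, b < m.
    Illustration only: Python integers and the final branch are not
    constant time.
    """
    mask = (1 << w) - 1
    # mu = -m^(-1) mod 2^w; the inverse exists because m is odd.
    # (Three-argument pow with exponent -1 needs Python 3.8 or later.)
    mu = (-pow(m, -1, 1 << w)) & mask
    a_limbs = to_limbs(a, w, n)
    t = 0
    for i in range(n):
        t += a_limbs[i] * b          # add one partial product
        q = ((t & mask) * mu) & mask  # makes t + q*m divisible by 2^w
        t = (t + q * m) >> w          # exact division by 2^w
    # At this point 0 <= t < 2*m, so one conditional subtraction suffices.
    if t >= m:
        t -= m
    return t
```

To multiply ordinary residues with this routine, one would first convert the operands to Montgomery form (x * R mod m), multiply, and convert back by a Montgomery multiplication with 1.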
