The design space of the number theoretic transform: A survey

The Number Theoretic Transform (NTT) is a necessary part of most lattice-based cryptographic schemes. In particular, it offers an efficient means of polynomial multiplication within the more efficient ring-based schemes. The NTT is also a critical component that must be implemented with care, since it is often the bottleneck and the most resource-consuming block of the whole design. As a result, the NTT is an appealing target for exploring different architectures and design trade-offs. In this paper, we compare various optimization strategies applied to maximize performance or to reduce resource utilization. Our analysis covers general-purpose processors as well as dedicated hardware implemented on reconfigurable platforms and on ASICs. Previously explored design strategies range from the trade-off between computing the multiplicative factors (called twiddle factors) on the fly and storing pre-computed twiddle factors in memory, to the use of different butterfly designs for implementing the Fast Fourier Transform and its inverse in software, and the sharing of resources between the forward and inverse NTT in hardware implementations. The problem of side-channel resistance is also addressed, discussing designs that are robust against power analysis attacks.
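As a rough illustration of the twiddle-factor trade-off described above, the following Python sketch multiplies two polynomials in Z_q[x]/(x^N + 1) using an iterative Cooley-Tukey NTT. The parameters (Q = 17, N = 8, PSI = 3) and all function names are assumptions chosen for a toy example; they are not taken from any scheme or implementation covered by the survey. The `table` argument switches between reading pre-computed twiddle factors from memory and recomputing them on the fly.

```python
# Toy parameters (assumptions for illustration only, far too small for real use).
Q = 17                  # prime modulus with Q = 1 (mod 2N)
N = 8                   # transform length, a power of two
PSI = 3                 # primitive 2N-th root of unity mod Q (3 has order 16 mod 17)
OMEGA = PSI * PSI % Q   # primitive N-th root of unity mod Q


def bit_reverse(a):
    """Return a copy of a with its elements in bit-reversed index order."""
    n = len(a)
    bits = n.bit_length() - 1
    out = [0] * n
    for i, x in enumerate(a):
        out[int(format(i, f"0{bits}b")[::-1], 2)] = x
    return out


def ntt(a, omega, table=None):
    """Iterative Cooley-Tukey NTT of a modulo Q.

    If table is given, it must hold omega**i mod Q for i in range(len(a) // 2)
    and twiddles are looked up; otherwise they are computed on the fly.
    """
    n = len(a)
    a = bit_reverse(a)
    length = 2
    while length <= n:
        half = length // 2
        step = n // length
        w_len = pow(omega, step, Q)      # twiddle increment for this stage
        for start in range(0, n, length):
            w = 1
            for j in range(half):
                if table is not None:
                    w = table[j * step]  # memory lookup
                u = a[start + j]
                v = a[start + j + half] * w % Q
                a[start + j] = (u + v) % Q         # butterfly: sum
                a[start + j + half] = (u - v) % Q  # butterfly: difference
                if table is None:
                    w = w * w_len % Q    # one extra multiplication per butterfly
        length *= 2
    return a


def negacyclic_mul(a, b, use_table=True):
    """Multiply a and b modulo (x^N + 1, Q) via the NTT."""
    omega_inv = pow(OMEGA, -1, Q)
    psi_inv = pow(PSI, -1, Q)
    n_inv = pow(N, -1, Q)
    fwd = [pow(OMEGA, i, Q) for i in range(N // 2)] if use_table else None
    inv = [pow(omega_inv, i, Q) for i in range(N // 2)] if use_table else None
    # Pre-scale by powers of PSI to turn the cyclic transform into a negacyclic one.
    fa = ntt([x * pow(PSI, i, Q) % Q for i, x in enumerate(a)], OMEGA, fwd)
    fb = ntt([x * pow(PSI, i, Q) % Q for i, x in enumerate(b)], OMEGA, fwd)
    fc = [x * y % Q for x, y in zip(fa, fb)]       # pointwise product
    c = ntt(fc, omega_inv, inv)                    # inverse transform, unscaled
    return [x * n_inv * pow(psi_inv, i, Q) % Q for i, x in enumerate(c)]


def schoolbook_negacyclic(a, b):
    """Reference O(N^2) multiplication modulo (x^N + 1, Q)."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            if k < N:
                c[k] = (c[k] + a[i] * b[j]) % Q
            else:
                c[k - N] = (c[k - N] - a[i] * b[j]) % Q  # x^N = -1
    return c
```

Both settings of `use_table` should agree with `schoolbook_negacyclic` on any pair of length-8 inputs. The table variant spends N/2 words of memory per direction to save one modular multiplication per butterfly, which is the kind of trade-off the survey compares across platforms. Note that the forward and inverse transforms here reuse the same butterfly routine with different roots, a simple software analogue of the resource sharing mentioned for hardware designs.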
