Efficient Ring-LWE Encryption on 8-Bit AVR Processors

Public-key cryptography based on the “ring variant” of the Learning with Errors (ring-LWE) problem is both efficient and believed to remain secure in a post-quantum world. In this paper, we introduce a carefully optimized implementation of a ring-LWE encryption scheme for 8-bit AVR processors such as the ATxmega128. Our research contributions include several optimizations for the Number Theoretic Transform (NTT) used for polynomial multiplication. More concretely, we describe the Move-and-Add (MA) and the Shift-Add-Multiply-Subtract-Subtract (SAMS2) techniques, which speed up the performance-critical multiplication and modular reduction of coefficients, respectively. We take advantage of incompletely reduced intermediate results to minimize the total number of reduction operations and use a special coefficient-storage method to decrease the RAM footprint of NTT multiplications. In addition, we propose a byte-wise scanning strategy to improve the performance of a discrete Gaussian sampler based on the Knuth-Yao random walk algorithm. For medium-term security, our ring-LWE implementation needs 590 k, 672 k, and 276 k clock cycles for key generation, encryption, and decryption, respectively. For long-term security, the execution times of key generation, encryption, and decryption amount to 2.2 M, 2.6 M, and 686 k cycles, respectively. These results set new speed records for ring-LWE encryption on an 8-bit processor and outperform related RSA and ECC implementations by an order of magnitude.
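The idea of working with incompletely reduced intermediate results can be illustrated with a small portable C sketch. It is not the AVR assembly code of the implementation described above, and every name in it (`Q`, `IR_BOUND`, `ir_add`, `full_reduce`) is hypothetical. The sketch assumes a 13-bit prime modulus q = 7681, a value that is not stated in the abstract. Coefficients are only kept below 2^13 rather than below q, so the range check is a cheap test of the upper bits, and a single full reduction is deferred to the very end.

```c
#include <assert.h>
#include <stdint.h>

#define Q        7681u  /* assumed 13-bit prime modulus (not given in the abstract) */
#define IR_BOUND 8192u  /* 2^13: incompletely reduced values stay below this bound */

/* Add two incompletely reduced coefficients. The result is congruent to
 * a + b modulo Q and lies in [0, 2^13), but not necessarily in [0, Q). */
uint16_t ir_add(uint16_t a, uint16_t b)
{
    assert(a < IR_BOUND && b < IR_BOUND);
    uint16_t s = (uint16_t)(a + b);       /* at most 2^14 - 2, fits in 16 bits */
    if (s >= IR_BOUND) {
        s = (uint16_t)(s - Q);            /* now at most 8701 */
    }
    if (s >= IR_BOUND) {
        s = (uint16_t)(s - Q);            /* now below 2^13 */
    }
    return s;
}

/* Final correction, applied once after all additions: maps an incompletely
 * reduced value to its canonical representative in [0, Q). */
uint16_t full_reduce(uint16_t x)
{
    assert(x < IR_BOUND);
    return (x >= Q) ? (uint16_t)(x - Q) : x;
}
```

A caller would chain `ir_add` across many coefficient additions and call `full_reduce` only on the final values. The check against 2^13 inspects just the top bits of the high byte, which is presumably cheaper on an 8-bit processor than a full 16-bit comparison with q.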
