Paper information - Fly, you fool! Faster Frodo for the ARM Cortex-M4

Fly, you fool! Faster Frodo for the ARM Cortex-M4

We present an efficient implementation of FrodoKEM-640 on an ARM Cortex-M4 core. We leverage the single instruction, multiple data (SIMD) paradigm, available in the instruction set of the ARM Cortex-M4, together with a careful analysis of the memory layout of matrices to considerably speed up matrix multiplications. Our implementations take up to 79.4% fewer cycles than the reference. Moreover, we challenge the use of a cryptographically secure pseudorandom number generator for generating the large public matrix involved. We argue that statistically good pseudorandomness is enough to achieve the same security goal. Therefore, we propose to use xoshiro128** as a PRNG instead: its structure can be easily integrated into FrodoKEM-640, it passes all known statistical tests, and it greatly outperforms previous choices. By using xoshiro128** we improve the generation of the large public matrix, which is a considerable bottleneck for embedded devices, by up to 96%.
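To make the PRNG proposal concrete, the following is a minimal C sketch of the publicly known xoshiro128** generator (version 1.1 update rule) used to fill one row of a 640-column matrix with entries modulo q = 2^15, the FrodoKEM-640 modulus. The names `xoshiro128ss_state`, `xoshiro128ss_next`, and `fill_row` are hypothetical, and splitting each 32-bit output into two 15-bit entries is an assumption made for illustration; the abstract does not describe how the paper seeds the generator or maps its outputs to matrix entries.

```c
#include <stddef.h>
#include <stdint.h>

#define FRODO_N      640u     /* matrix dimension of FrodoKEM-640 */
#define FRODO_Q_MASK 0x7FFFu  /* q = 2^15, so reduction mod q is a mask */

/* Hypothetical state wrapper; the state must never be all zero. */
typedef struct {
    uint32_t s[4];
} xoshiro128ss_state;

static inline uint32_t rotl32(uint32_t x, unsigned k)
{
    return (x << k) | (x >> (32u - k));
}

/* One step of xoshiro128** (1.1): returns 32 pseudorandom bits. */
static uint32_t xoshiro128ss_next(xoshiro128ss_state *st)
{
    uint32_t *s = st->s;
    const uint32_t result = rotl32(s[1] * 5u, 7) * 9u;
    const uint32_t t = s[1] << 9;

    s[2] ^= s[0];
    s[3] ^= s[1];
    s[1] ^= s[2];
    s[0] ^= s[3];
    s[2] ^= t;
    s[3] = rotl32(s[3], 11);

    return result;
}

/* Assumed mapping (not taken from the paper): each 32-bit output
 * yields two matrix entries, each reduced modulo q = 2^15. */
static void fill_row(xoshiro128ss_state *st, uint16_t *row, size_t n)
{
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        const uint32_t r = xoshiro128ss_next(st);
        row[i]     = (uint16_t)(r & FRODO_Q_MASK);
        row[i + 1] = (uint16_t)((r >> 16) & FRODO_Q_MASK);
    }
    if (n & 1u) {
        /* Odd length: one extra output covers the final entry. */
        row[n - 1] = (uint16_t)(xoshiro128ss_next(st) & FRODO_Q_MASK);
    }
}
```

A caller would initialise `xoshiro128ss_state` from a nonzero seed and call `fill_row(&st, row, FRODO_N)` once per row. The generator needs only shifts, rotations, XORs, and two small multiplications on 32-bit words, which is plausibly why it is cheap on a 32-bit core such as the Cortex-M4.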
