Fast S-Box Substitution Instructions and Their Hardware Implementation for Accelerating Symmetric Cryptographic Processing

In popular symmetric ciphers, S-box substitution is the core operation and dominates the execution time of the algorithm. In this paper, application-specific instruction-set extension is used to accelerate this key operation. Two S-box access instructions are designed around a novel, flexible on-chip parallel substitution box unit that consists of multiple lookup tables and a post-processing module. The unit is integrated into the 32-bit configurable Leon2 processor, and the Leon2 configuration is presented. Implementing the extended processor core on a Virtex-II XC2V3000 FPGA shows that the parallel substitution box unit uses a very small amount of hardware resources (1 KB of memory and some logic circuits). The performance of the S-box access instructions for AES is evaluated according to Amdahl's law, and the results show that an overall speedup greater than 2 can be achieved. Similar benefits are expected for other symmetric ciphers that use S-box substitution as their core operation.
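
To illustrate the kind of estimate that Amdahl's law gives, the following minimal sketch computes the overall speedup from the fraction of execution time spent in S-box substitution and the local speedup of that part. The function name and the input values (a fraction of 0.6 and a local speedup of 8) are hypothetical and chosen for illustration only; they are not measurements from this work.

```python
def amdahl_speedup(fraction: float, local_speedup: float) -> float:
    """Overall speedup when `fraction` of the run time is sped up by `local_speedup`.

    Amdahl's law: S_overall = 1 / ((1 - f) + f / s)
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Hypothetical inputs (assumptions, not results from the paper):
sbox_fraction = 0.6   # share of AES run time spent in S-box substitution
sbox_speedup = 8.0    # speedup of the S-box part with dedicated instructions

overall = amdahl_speedup(sbox_fraction, sbox_speedup)
print(f"overall speedup: {overall:.2f}")
```

Because the overall speedup is bounded above by 1 / (1 - f), a result greater than 2 is possible only when the accelerated part accounts for more than half of the original execution time.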