On-chip lookup tables for fast symmetric-key encryption

On public communication networks such as the Internet, data confidentiality can be provided by symmetric-key ciphers. One of the most common operations in symmetric-key ciphers is the table lookup. Table lookups frequently account for the largest fraction of execution time when the ciphers are implemented using a typical RISC-like instruction set. To accelerate these table lookups, we describe a new hardware module, called PTLU (for parallel table lookup), which consists of multiple lookup tables that can be accessed in parallel. A novel combinational circuit included in the module can optionally perform simple logic operations on the data read from the tables. On a single-issue 64-bit RISC processor, PTLU provides maximum speedups of 7.7x for AES and 5.4x for DES. With wordsize scaling, PTLU speedups are significantly higher than those available through more conventional architectural techniques such as superscalar or VLIW execution.
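
The idea of reading several tables at once and combining the results can be illustrated with a small software model. This is only a sketch under stated assumptions, not the PTLU instruction semantics: the table count (8), the 8-bit index per table, the 32-bit entry width, the choice of XOR as the combining operation, and all function names are hypothetical and chosen for illustration.

```python
# Illustrative software model of a parallel table lookup.
# Assumptions (not from the source): 8 tables, 256 entries each,
# 32-bit entries, and XOR as the optional combining operation.
NUM_TABLES = 8
ENTRIES = 256
MASK32 = 0xFFFFFFFF


def make_tables():
    """Fill each table with arbitrary but deterministic 32-bit values."""
    return [
        [((t + 1) * 0x9E3779B1 * (i + 1)) & MASK32 for i in range(ENTRIES)]
        for t in range(NUM_TABLES)
    ]


def serial_lookup_xor(tables, word):
    """Baseline: extract one byte, do one lookup and one XOR per step."""
    acc = 0
    for t in range(NUM_TABLES):
        index = (word >> (8 * t)) & 0xFF
        acc ^= tables[t][index]
    return acc


def parallel_lookup(tables, word, combine_xor=True):
    """Model one parallel lookup: byte t of `word` indexes table t.

    In hardware all reads would happen at the same time; here the list
    comprehension only stands in for that. If combine_xor is set, the
    values read are XORed into a single result, otherwise they are
    returned individually.
    """
    reads = [tables[t][(word >> (8 * t)) & 0xFF] for t in range(NUM_TABLES)]
    if not combine_xor:
        return reads
    result = 0
    for value in reads:
        result ^= value
    return result


if __name__ == "__main__":
    tables = make_tables()
    word = 0x0123456789ABCDEF
    # Both paths compute the same value; the parallel form replaces
    # eight extract/load/XOR sequences with a single modeled operation.
    assert parallel_lookup(tables, word) == serial_lookup_xor(tables, word)
    print(hex(parallel_lookup(tables, word)))
```

The serial function mirrors what a plain RISC instruction sequence has to do for table-based cipher rounds, while the parallel function shows the work that a single multi-table access with a combining stage would cover.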
