Processor accelerator for AES

Software AES cipher performance is not fast enough for encryption to be incorporated ubiquitously for all computing needs. Furthermore, fast software implementations of AES that use table lookups are susceptible to software cache-based side channel attacks, leaking the secret encryption key. To bridge the gap between software and hardware AES implementations, several Instruction Set Architecture (ISA) extensions have been proposed to provide speedup for software AES programs, most notably the recent introduction of six AES-specific instructions for Intel microprocessors. However, algorithm-specific instructions are less desirable than general-purpose ones for microprocessors. In this paper, we propose an enhanced parallel table lookup instruction that can achieve the fastest reported software AES encryption and decryption of 1.38 cycles/byte for general-purpose microprocessors, a 1.45X speedup from the fastest prior work reported. Also, security is improved where cache-based side-channel attacks are thwarted, since all table lookups take the same amount of time. Furthermore, the new instructions can also be used to accelerate any functions that can be accelerated through table lookup operations of one or multiple small tables.

[1]  Vincent Rijmen,et al.  The Design of Rijndael , 2002, Information Security and Cryptography.

[2]  Colin Percival CACHE MISSING FOR FUN AND PROFIT , 2005 .

[3]  Ruby B. Lee,et al.  On-chip lookup tables for fast symmetric-key encryption , 2005, 2005 IEEE International Conference on Application-Specific Systems, Architecture Processors (ASAP'05).

[4]  Ruby B. Lee,et al.  New cache designs for thwarting software cache-based side channel attacks , 2007, ISCA '07.

[5]  Rainer Buchty,et al.  AES and the cryptonite crypto processor , 2003, CASES '03.

[6]  Jean-Didier Legat,et al.  Efficient Implementation of Rijndael Encryption in Reconfigurable Hardware: Improvements and Design Tradeoffs , 2003, CHES.

[7]  Peter Schwabe,et al.  New AES Software Speed Records , 2008, INDOCRYPT.

[8]  John Waldron,et al.  Practical Symmetric Key Cryptography on Modern Graphics Hardware , 2008, USENIX Security Symposium.

[9]  K.K. Parhi,et al.  An efficient 21.56 Gbps AES implementation on FPGA , 2004, Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, 2004..

[10]  Giovanni Agosta,et al.  Design of a parallel AES for graphics hardware using the CUDA framework , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.

[11]  Johann Großschädl,et al.  Instruction Set Extensions for Efficient AES Implementation on 32-bit Processors , 2006, CHES.

[12]  Onur Aciiçmez,et al.  Cache Based Remote Timing Attack on the AES , 2007, CT-RSA.

[13]  Joseph Bonneau,et al.  Cache-Collision Timing Attacks Against AES , 2006, CHES.

[14]  Peter Schwabe,et al.  Faster and Timing-Attack Resistant AES-GCM , 2009, CHES.

[15]  Ingrid Verbauwhede,et al.  A 3.84 gbits/s AES crypto coprocessor with modes of operation in a 0.18-μm CMOS technology , 2005, ACM Great Lakes Symposium on VLSI.