Implementation and Analysis of AES Encryption on GPU

GPUs continue to vastly outperform CPUs while becoming more general purpose. To improve the efficiency of the AES algorithm, this paper presents a CUDA implementation of the Electronic Codebook (ECB) mode encryption process and the Cipher Block Chaining (CBC) mode decryption process on a GPU. In our implementation, the frequently accessed T-boxes are placed in on-chip shared memory, and we adopt a granularity in which one thread handles one 16-byte AES block. We achieve a peak throughput of around 60 Gbps on an NVIDIA Tesla C2050 GPU, up to 50 times faster than a sequential implementation on an Intel Core i7-920 2.66 GHz CPU. In addition, we discuss optimizations for practical application scenarios, such as overlapping GPU processing with data transfer.
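The one-thread-per-block granularity works for ECB encryption and CBC decryption because, in both cases, each 16-byte block can be processed without waiting for the result of any other block. The following is a minimal Python sketch of that data dependency only, not the paper's CUDA code. The functions `toy_encrypt_block` and `toy_decrypt_block` are hypothetical stand-ins for the AES block cipher (they are not AES and offer no security), and all other names are likewise illustrative assumptions.

```python
BLOCK = 16  # AES block size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    """Stand-in for AES encryption (NOT AES): XOR with key, rotate left one byte."""
    x = xor(block, key)
    return x[1:] + x[:1]


def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    """Inverse of toy_encrypt_block: rotate right one byte, XOR with key."""
    x = block[-1:] + block[:-1]
    return xor(x, key)


def split_blocks(data: bytes) -> list:
    """Split data into 16-byte blocks; length must be a multiple of BLOCK."""
    assert len(data) % BLOCK == 0
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt_block(i: int, plain: list, key: bytes) -> bytes:
    """Work of 'thread' i in ECB encryption: reads only plaintext block i."""
    return toy_encrypt_block(plain[i], key)


def cbc_encrypt(plain: list, key: bytes, iv: bytes) -> list:
    """CBC encryption is sequential: block i needs ciphertext block i-1."""
    prev, out = iv, []
    for p in plain:
        prev = toy_encrypt_block(xor(p, prev), key)
        out.append(prev)
    return out


def cbc_decrypt_block(i: int, cipher: list, key: bytes, iv: bytes) -> bytes:
    """Work of 'thread' i in CBC decryption: reads only ciphertext blocks
    i and i-1, both of which are available before any thread starts."""
    prev = iv if i == 0 else cipher[i - 1]
    return xor(toy_decrypt_block(cipher[i], key), prev)


if __name__ == "__main__":
    key, iv = bytes(range(16)), bytes(16)
    plain = split_blocks(bytes(range(64)))
    cipher = cbc_encrypt(plain, key, iv)
    # Decrypt in reverse order to show that block order does not matter.
    recovered = {i: cbc_decrypt_block(i, cipher, key, iv)
                 for i in reversed(range(len(cipher)))}
    assert [recovered[i] for i in range(len(plain))] == plain
```

In a GPU kernel, the index `i` would correspond to a thread's global index, so every call to `ecb_encrypt_block` or `cbc_decrypt_block` could run concurrently. CBC encryption, by contrast, chains each block to the previous ciphertext and cannot be split this way.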
