As protecting data with encryption grows more important, encryption using general-purpose computation on graphics processing units (GPGPU) has attracted attention as one way to achieve high-speed data protection. In recent years GPUs have evolved into powerful parallel computing devices with a high cost-performance ratio. However, many factors affect GPU performance. In earlier work, in which we tried various ways of raising AES performance with GPGPU, we reached two technical findings: (1) 16 bytes per thread is the best granularity, and (2) the best memory allocation stores the extended key and the substitution table in shared memory and the plaintext in registers. AES, however, is not the only cipher algorithm widely used in practice. This study therefore tests the hypothesis that these two findings also apply to implementations of other symmetric block ciphers on two generations of GPUs. We targeted five 128-bit symmetric block ciphers, AES, Camellia, CIPHERUNICORN-A, Hierocrypt-3, and SC2000, taken from the e-government recommended ciphers list of the CRYPTography Research and Evaluation Committees (CRYPTREC) in Japan. We evaluated the performance of these five ciphers on a machine with a 4-core CPU and each of the GPUs, using three methods: (A) throughput without data transfer, (B) throughput with data transfer overlapped with encryption processing on the GPU, and (C) throughput with data transfer not overlapped with encryption processing on the GPU. The results show that the SC2000 implementation on the Tesla C2050 achieved a very high throughput of 73.4 Gbps with method (A). Once data transfer was included, throughput fell to 33.4 Gbps with method (B) and 18.3 Gbps with method (C). Even so, method (B) was approximately 4.7 times faster than the same encryption run with 8 threads on the 4-core CPU.
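The reported figures can be related to one another with a little arithmetic. The following is a minimal sketch, not code from the study: the function names are hypothetical, the throughput values are the SC2000 numbers quoted above, and the sketch assumes that 1 Gbps means 10^9 bits per second and that the "approximately 4.7 times" ratio can be inverted to estimate the CPU throughput.

```python
# Figures quoted in the abstract for SC2000 on the Tesla C2050, in Gbps.
REPORTED_GBPS = {
    "A": 73.4,  # no host-device data transfer
    "B": 33.4,  # data transfer overlapped with encryption
    "C": 18.3,  # data transfer not overlapped with encryption
}
CPU_SPEEDUP_B = 4.7    # method (B) vs. 8 threads on a 4-core CPU (approximate)
BYTES_PER_THREAD = 16  # finding (1): one 128-bit block per GPU thread
BLOCK_BITS = 128


def threads_needed(message_bytes: int) -> int:
    """GPU threads required at 16 bytes/thread, rounding a partial block up."""
    return -(-message_bytes // BYTES_PER_THREAD)


def fraction_of_method_a(method: str) -> float:
    """Share of the transfer-free throughput that a method retains."""
    return REPORTED_GBPS[method] / REPORTED_GBPS["A"]


def implied_cpu_gbps() -> float:
    """CPU throughput implied by the ~4.7x ratio (an estimate, not a reported value)."""
    return REPORTED_GBPS["B"] / CPU_SPEEDUP_B


def blocks_per_second(method: str, bits_per_gigabit: int = 10**9) -> float:
    """128-bit blocks encrypted per second (assumes decimal gigabits)."""
    return REPORTED_GBPS[method] * bits_per_gigabit / BLOCK_BITS


if __name__ == "__main__":
    print(f"threads for 1 MiB of plaintext: {threads_needed(1 << 20)}")
    for m in ("B", "C"):
        print(f"method ({m}) retains {fraction_of_method_a(m):.1%} of method (A)")
    print(f"overlap gain, (B) over (C): {REPORTED_GBPS['B'] / REPORTED_GBPS['C']:.2f}x")
    print(f"implied CPU throughput: {implied_cpu_gbps():.1f} Gbps")
```

Under these assumptions, method (B) retains a little under half of the transfer-free throughput and method (C) about a quarter, which suggests that host-device transfer, rather than the cipher kernel itself, dominates the end-to-end cost.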