Workload characterization of cryptography algorithms for hardware acceleration

Data encryption and decryption have become essential components of modern information exchange. However, executing cryptographic algorithms often incurs substantial overhead, which creates a corresponding need to reduce it. In this paper, we select nine widely adopted cryptographic algorithms and study their workload characteristics. Unlike many previous works, we consider overhead not only from the perspective of computation but also from that of the memory access pattern. We break down function execution time to identify software bottlenecks suitable for hardware acceleration, and then categorize the operations these algorithms require. In particular, we introduce a concept called the 'Load-Store Block' (LSB) and perform LSB identification for the various algorithms. Our results show that, for cryptographic algorithms, the execution rate of most hotspot functions exceeds 60%; the memory access instruction ratio is mostly above 60%; and LSB instructions account for more than 30% in the selected benchmarks. Based on our findings, we suggest future directions for designing either a hardware accelerator attached to a microprocessor or a dedicated microprocessor for cryptography applications.
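As a rough illustration of two of the metrics named above, the following sketch computes a memory access instruction ratio and the share of the most-executed function from an instruction trace. Everything in it is an assumption made for illustration: the trace format, the function names, the opcode names, and the use of instruction counts as a stand-in for execution time. It is not the measurement method used in the paper, and it does not attempt LSB identification, which the abstract does not define.

```python
from collections import Counter

# Hypothetical mnemonics treated as memory accesses; a real ISA needs its own list.
MEMORY_OPS = {"load", "store"}


def memory_access_ratio(trace):
    """Fraction of executed instructions that are loads or stores."""
    if not trace:
        return 0.0
    hits = sum(1 for _, op in trace if op in MEMORY_OPS)
    return hits / len(trace)


def hotspot_share(trace):
    """Return (function, fraction of executed instructions) for the
    most-executed function. Instruction count is only a proxy for time."""
    counts = Counter(func for func, _ in trace)
    func, n = counts.most_common(1)[0]
    return func, n / len(trace)


# Made-up trace of (function, opcode) pairs, for illustration only.
trace = [
    ("encrypt_block", "load"),
    ("encrypt_block", "xor"),
    ("encrypt_block", "store"),
    ("encrypt_block", "load"),
    ("key_setup", "add"),
]

print(memory_access_ratio(trace))
print(hotspot_share(trace))
```

In practice the trace would come from a profiler or simulator, and the set of memory opcodes would depend on the target instruction set.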
