Architectural support for fast symmetric-key cryptography

The emergence of the Internet as a trusted medium for commerce and communication has made cryptography an essential component of modern information systems. Cryptography provides the mechanisms necessary to implement accountability, accuracy, and confidentiality in communication. As demands for secure communication bandwidth grow, efficient cryptographic processing will become increasingly vital to good system performance.

In this paper, we explore techniques to improve the performance of symmetric-key cipher algorithms. Eight popular strong encryption algorithms are examined in detail. Analysis reveals that the algorithms are computationally complex and contain little parallelism. Overall throughput on a high-end microprocessor is quite poor: a 600 MHz processor is incapable of saturating a T3 communication line with 3DES (triple DES) encrypted data.

We introduce new instructions that improve the efficiency of the analyzed algorithms. Our approach adds instruction set support for fast substitutions, general permutations, rotates, and modular arithmetic. Performance analysis of the optimized ciphers shows an overall speedup of 59% over a baseline machine with rotate instructions and a 74% speedup over a baseline without rotates. Even higher speedups are demonstrated with optimized substitutions (SBOXes) and additional functional unit resources. Our analyses of the original and optimized algorithms suggest future directions for the design of high-performance programmable cryptographic processors.
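To make the targeted operations concrete, the following is a minimal software sketch of three of the kernel operations named above: rotates, table-based substitutions, and modular arithmetic. It is illustrative only and is not the proposed instruction set. The function names are hypothetical, and the modulus 2^16 + 1 (the convention used by the IDEA cipher, where an operand of 0 stands for 2^16) is an assumption chosen as an example, since the abstract does not list the ciphers or their moduli.

```python
MASK32 = 0xFFFFFFFF  # ciphers of this kind typically operate on 32-bit words


def rotl32(x, r):
    """Rotate a 32-bit word left by r bits (r is taken modulo 32).

    Without a rotate instruction this costs two shifts and an OR.
    """
    r &= 31
    x &= MASK32
    return ((x << r) | (x >> (32 - r))) & MASK32


def sbox_lookup(table, word, byte_index):
    """Substitute one byte of a 32-bit word through a 256-entry table.

    In software this is a shift, a mask, and a load per substituted byte.
    """
    return table[(word >> (8 * byte_index)) & 0xFF]


def mulmod_65537(a, b):
    """Multiply two 16-bit values modulo 2**16 + 1 (assumed IDEA-style).

    An operand of 0 represents 2**16, and a result of 2**16 is encoded as 0.
    """
    a = a or 0x10000
    b = b or 0x10000
    return (a * b % 0x10001) & 0xFFFF
```

Each function expands to several dependent integer operations on a conventional processor, which may help explain why dedicated instruction support for these patterns can pay off.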
