Architectural Enhancements to Support Digital Signal Processing and Public-Key Cryptography

In recent years, every major microprocessor architecture has been extended with special instructions to accelerate the processing of DSP or multimedia workloads. Even simple processors developed for embedded systems are nowadays equipped with fast multiply/accumulate (MAC) units to provide greater performance in processing DSP/multimedia kernels. In the present paper, we investigate the suitability of these architectural enhancements for speeding up the arithmetic operations used in public-key cryptography, most notably multiple-precision modular multiplication. We analyze different algorithms for modular arithmetic and discuss how these algorithms can take advantage of the fast MAC units that are present in various RISC cores based on the MIPS32 and ARMv5TE architectures. Furthermore, we compare architectural enhancements and instruction set extensions specifically designed to accelerate long integer arithmetic. Our analysis shows that the MIPS32 architecture can be easily extended for efficient cryptographic processing and offers some advantages over the ARMv5TE architecture.
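To illustrate why a MAC unit matters for long integer arithmetic, the following is a minimal sketch in portable C of multiple-precision multiplication in product-scanning (column-wise) form. It is not taken from the paper: the function name mp_mul_comba, the 32-bit word size, and the little-endian word order are assumptions made for illustration only. The inner loop body is a single multiply/accumulate step, which is the operation that instructions such as MADDU (MIPS32) or UMLAL (ARM) perform in hardware; the extra guard word acc_hi stands in for accumulator bits beyond 64 that a plain 64-bit accumulator does not provide.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: r = a * b, where a and b are n-word integers
 * (32-bit words, least significant word first) and r has room for 2n words.
 * Requires n >= 1. r must not overlap a or b.
 */
void mp_mul_comba(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t n)
{
    uint64_t acc = 0;     /* low 64 bits of the column accumulator */
    uint32_t acc_hi = 0;  /* guard word: carries out of the low 64 bits */

    for (size_t k = 0; k + 1 < 2 * n; k++) {
        /* Range of i such that both i and k - i are valid word indices. */
        size_t i_lo = (k < n) ? 0 : k - n + 1;
        size_t i_hi = (k < n) ? k : n - 1;

        for (size_t i = i_lo; i <= i_hi; i++) {
            /* One multiply/accumulate step: acc += a[i] * b[k - i]. */
            uint64_t p = (uint64_t)a[i] * b[k - i];
            acc += p;
            acc_hi += (acc < p);  /* record a carry out of bit 63 */
        }

        r[k] = (uint32_t)acc;     /* lowest word of the column is final */
        /* Shift the accumulator right by one word for the next column. */
        acc = (acc >> 32) | ((uint64_t)acc_hi << 32);
        acc_hi = 0;
    }
    r[2 * n - 1] = (uint32_t)acc; /* remaining word is the top of the product */
}
```

A modular multiplication would follow this product with a reduction step (for example Montgomery or Barrett reduction), which is omitted here; the sketch only shows how the bulk of the work reduces to repeated MAC operations.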