Fast modular multiplication with carry save adder

An implementation algorithm for modular multiplication is proposed, based on Koc's sign estimation technique. The main computation is divided into two parts that run in parallel: computing 2T mod N and computing P = P + a_j T (mod N). Because both parts use the sign estimation technique, the n-bit addition carry chain is reduced to a 4-bit addition, which allows a high circuit clock frequency. One n-bit modular multiplication is completed every n+1 clock cycles. For an n-bit exponent and an n-bit modulus, modular exponentiation takes 1.5n(n+1) clock cycles and costs 20n gates with the carry-save adder (CSA) technique.
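
The two recurrences named in the abstract can be modeled in software as a bit-serial loop over the multiplier, scanned least significant bit first. The sketch below is an illustration only, not the paper's circuit: it replaces sign estimation with an exact compare-and-subtract, and the names `mod_mul_interleaved` and `carry_save_add` are hypothetical. The carry-save helper shows the general reason a CSA shortens the carry chain: three operands are compressed into a sum word and a carry word with no carry propagation across bit positions.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """Compress three non-negative operands into (sum_word, carry_word).

    Every bit position is computed independently, so no carry ripples
    across the word. The invariant is x + y + z == sum_word + carry_word.
    """
    sum_word = x ^ y ^ z                               # bitwise sum without carries
    carry_word = ((x & y) | (x & z) | (y & z)) << 1    # majority bits, shifted left
    return sum_word, carry_word


def mod_mul_interleaved(a: int, b: int, modulus: int, nbits: int) -> int:
    """Return (a * b) mod modulus by scanning the bits a_j of a, LSB first.

    Assumes 0 <= a < 2**nbits and modulus > 0. The reductions here are
    exact compare-and-subtract steps; the paper uses sign estimation
    instead, which this model does not reproduce.
    """
    t = b % modulus   # T holds b * 2**j mod N at step j
    p = 0             # P accumulates the partial product
    for j in range(nbits):
        a_j = (a >> j) & 1
        # In hardware these two updates would run in parallel in one cycle.
        if a_j:
            p += t                 # P = P + a_j * T
            if p >= modulus:
                p -= modulus       # keep P in [0, N)
        t <<= 1                    # T = 2T
        if t >= modulus:
            t -= modulus           # keep T in [0, N)
    return p
```

The exponentiation figure quoted in the abstract is consistent with binary square-and-multiply, which for an n-bit exponent needs about n squarings plus n/2 multiplications on average, that is roughly 1.5n modular multiplications of n+1 cycles each. This reading is an inference from the stated numbers rather than a statement taken from the paper.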