RNS-Based Embedded Processor Design

This chapter describes a unified architecture that provides generic, programmable, and scalable RNS computation based on moduli channels of the form \(\{2^{n} \pm k_{i}\}\). The architecture supports RNS designs with an arbitrarily long moduli set of the form \(\{2^{n} \pm k_{0}, \cdots, 2^{n} \pm k_{j}\}\), with \(j \in \mathbb{N}_{0}^{+}\). This moduli set allows the number of RNS channels to be increased arbitrarily, which either enlarges the Dynamic Range (DR) or, for a fixed DR, narrows the channels. Narrower channels reduce the delay and area cost of the arithmetic operations and allow the RNS parallelism to be exploited further. The proposed RNS architecture provides not only a programmable processor capable of supporting a wide range of algorithms using RNS, but also a tool for researchers to evaluate new algorithms, moduli sets, and conversion approaches.
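As a rough software illustration of the channel structure described above (not the hardware architecture itself), the following Python sketch builds a moduli set of the form \(\{2^{n} \pm k_{i}\}\), performs addition and multiplication independently in each channel, and converts back to binary with the textbook Chinese Remainder Theorem. The function names, the choice n = 8, and the offsets k = ±1, ±3 are assumptions made for this example only; the chapter's own conversion methods may differ.

```python
from itertools import combinations
from math import gcd, prod


def make_moduli(n, offsets):
    """Build moduli 2^n + k for each signed offset k and check pairwise coprimality."""
    moduli = [(1 << n) + k for k in offsets]
    for a, b in combinations(moduli, 2):
        if gcd(a, b) != 1:
            raise ValueError(f"moduli {a} and {b} are not coprime")
    return moduli


def to_rns(x, moduli):
    """Forward conversion: one residue per channel."""
    return [x % m for m in moduli]


def rns_add(a, b, moduli):
    """Channel-wise modular addition; no carries cross between channels."""
    return [(ra + rb) % m for ra, rb, m in zip(a, b, moduli)]


def rns_mul(a, b, moduli):
    """Channel-wise modular multiplication."""
    return [(ra * rb) % m for ra, rb, m in zip(a, b, moduli)]


def from_rns(residues, moduli):
    """Reverse conversion with the textbook CRT (illustrative, not optimized)."""
    dynamic_range = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        partial = dynamic_range // m
        # pow(x, -1, m) computes the modular inverse (Python 3.8+).
        total += r * partial * pow(partial, -1, m)
    return total % dynamic_range


# Hypothetical example set: n = 8 with offsets -3, -1, +1, +3.
moduli = make_moduli(8, [-3, -1, 1, 3])
x, y = 123456, 7890
total = from_rns(rns_add(to_rns(x, moduli), to_rns(y, moduli), moduli), moduli)
product = from_rns(rns_mul(to_rns(x, moduli), to_rns(y, moduli), moduli), moduli)
```

Results are exact only while they stay below the product of the moduli. Adding further coprime channels of the same form enlarges that product, which is the scaling property the abstract refers to.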
