DRC2: Dynamically Reconfigurable Computing Circuit based on memory architecture

This paper presents a novel, energy-efficient Dynamically Reconfigurable Computing Circuit (DRC2) concept based on memory architecture for data-intensive (e.g., imaging) and secure (e.g., cryptography) applications. The proposed computing circuit is built on a 10-Transistor (10T) 3-Port SRAM bitcell array driven by peripheral circuitry that enables all the basic operations traditionally performed by an ALU. As a result, logic and arithmetic operations can be executed entirely within the memory unit, which significantly reduces the power consumed by data transfers between memories and computing units. Moreover, the proposed computing circuit can perform massively parallel operations, enabling the processing of large volumes of data. A test case based on an image processing application and using the saturating increment function is analytically modeled to compare the conventional and DRC2-based approaches. The DRC2-based approach is shown to reduce the number of clock cycles by up to 2×. Finally, potential applications and the changes that must be considered at different design levels are discussed.
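For readers unfamiliar with the saturating increment function used in the test case, the following is a minimal software sketch of its behavior. It is not the in-memory circuit implementation described in the paper; the 8-bit pixel width and the names `saturating_increment` and `increment_image` are assumptions made for illustration.

```python
def saturating_increment(value: int, bits: int = 8) -> int:
    """Add 1 to an unsigned value, clamping at the largest representable value.

    The default width of 8 bits is an assumption (typical grayscale pixel).
    """
    max_value = (1 << bits) - 1
    # A plain increment would wrap around to 0; saturation holds the maximum.
    return value if value >= max_value else value + 1


def increment_image(pixels: list[list[int]], bits: int = 8) -> list[list[int]]:
    """Apply the saturating increment to every pixel of a 2D image."""
    return [[saturating_increment(p, bits) for p in row] for row in pixels]


if __name__ == "__main__":
    image = [[0, 255], [127, 254]]
    print(increment_image(image))
```

In a conventional flow each pixel would be read from memory, processed, and written back, whereas the abstract's premise is that such an operation can be carried out inside the memory array itself.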
