High-Performance, Cost-Effective 3D Stacked Wide-Operand Adders

Through-Silicon Via (TSV) based 3D Stacked IC (3D-SIC) technology introduces new design opportunities for wide-operand addition units. Unlike state-of-the-art direct folding proposals, we introduce two cost-effective 3D Stacked Hybrid Adders with identical tier structure, which potentially makes the manufacturing of fast wide-operand hardware adders a reality. An $N$-bit adder implemented on a stacked IC with $K$ identical tiers performs two $N/K$-bit additions in parallel on each tier, according to the anticipated computation principle. Inter-tier carry signals, which perform the appropriate sum selection, are propagated by TSVs. The practical implications of direct folding and of our hybrid carry-select/prefix approaches are evaluated in a thorough case study on 65 nm CMOS 3D adder implementations, for operand sizes up to 4,096 bits and up to 16 tiers. Our simulations indicate that in almost all configurations at least one of the two proposed 3D stacked hybrid approaches is faster than the fastest 3D folding approach.
When considering a metric appropriate for 3D designs, i.e., the delay-footprint-heterogeneity product, the hybrid adders substantially outperform their folding counterparts, by a factor between $1.67\times$ and $23.95\times$.
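
The anticipated computation principle described above can be illustrated with a small behavioral model. This is a minimal sketch, not the authors' design: it assumes each tier adds its $N/K$-bit operand slices twice (once for carry-in 0 and once for carry-in 1) and that the carry arriving from the tier below selects which result is kept. The function names `tier_add` and `stacked_add` are hypothetical, and the model ignores the prefix structure used inside each tier as well as all timing and TSV effects.

```python
def tier_add(a_slice, b_slice, width):
    """Return both anticipated results for one tier.

    Index 0 holds (sum, carry_out) for carry-in 0, index 1 for carry-in 1.
    """
    mask = (1 << width) - 1
    results = []
    for carry_in in (0, 1):
        total = a_slice + b_slice + carry_in
        results.append((total & mask, total >> width))
    return results


def stacked_add(a, b, n_bits, k_tiers):
    """Behavioral model of an N-bit addition split over K identical tiers.

    Hypothetical helper: returns (sum modulo 2**n_bits, final carry-out).
    """
    if n_bits % k_tiers != 0:
        raise ValueError("n_bits must be a multiple of k_tiers")
    width = n_bits // k_tiers
    mask = (1 << width) - 1

    # In hardware all tiers compute concurrently; here they are just a list.
    anticipated = [
        tier_add((a >> (t * width)) & mask, (b >> (t * width)) & mask, width)
        for t in range(k_tiers)
    ]

    # The inter-tier carry (carried by TSVs in the real design) selects
    # the matching precomputed result on each successive tier.
    carry = 0
    result = 0
    for t, options in enumerate(anticipated):
        tier_sum, carry = options[carry]
        result |= tier_sum << (t * width)
    return result, carry
```

For example, `stacked_add(0xFFFF, 1, 16, 4)` splits the operands into four 4-bit slices; the carry ripples through all four selections and the call returns `(0, 1)`. Only the selection step is sequential across tiers, which is why the inter-tier carry path matters for the overall delay.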
