A 10.35 mW/GFlop stacked SAR DSP unit using fine-grain partitioned 3D integration

In this paper we present a technique for implementing a fine-grain partitioned three-dimensional synthetic aperture radar (SAR) DSP system using 3D placement of standard cells, in which only one of the 3D tiers is clocked to reduce clock power. We show how this technique was used to build an ultra-efficient floating-point SAR DSP processing unit, the first fine-grain partitioned 3D integrated system to be demonstrated with silicon measurements in the literature. The processing unit was fabricated in two tiers of the GlobalFoundries 1.5 V, 130 nm process that were 3D stacked face-to-face by Tezzaron. After fabrication, the test chip was measured to consume 4.14 mW while running at 40 MHz, for an operating efficiency of 10.35 mW/GFlop.
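
The three measured figures quoted above (4.14 mW, 40 MHz, 10.35 mW/GFlop) can be cross-checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the throughput and per-cycle values it computes are derived from the quoted figures rather than stated in the abstract.

```python
# Cross-check of the abstract's figures. The derived values are implied
# by the quoted measurements, not reported directly by the authors.

def implied_gflops(power_mw: float, efficiency_mw_per_gflop: float) -> float:
    """Throughput in GFlop/s implied by measured power and mW/GFlop."""
    return power_mw / efficiency_mw_per_gflop


def flops_per_cycle(gflops: float, clock_mhz: float) -> float:
    """Floating-point operations per clock cycle at the given clock rate."""
    return (gflops * 1e9) / (clock_mhz * 1e6)


if __name__ == "__main__":
    power_mw = 4.14      # measured power, from the abstract
    efficiency = 10.35   # mW/GFlop, from the abstract
    clock_mhz = 40.0     # operating frequency, from the abstract

    throughput = implied_gflops(power_mw, efficiency)
    per_cycle = flops_per_cycle(throughput, clock_mhz)
    print(f"implied throughput: {throughput:.3f} GFlop/s")
    print(f"implied operations per cycle: {per_cycle:.1f}")
```

Under these figures the implied throughput is about 0.4 GFlop/s, or roughly 10 floating-point operations per 40 MHz clock cycle, which suggests the three quoted numbers are mutually consistent.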
