Re-architecting DRAM memory systems with monolithically integrated silicon photonics

The performance of future manycore processors will only scale with the number of integrated cores if there is a corresponding increase in memory bandwidth. Projected scaling of electrical DRAM architectures appears unlikely to suffice, being constrained by processor and DRAM pin-bandwidth density and by total DRAM chip power, including off-chip signaling, cross-chip interconnect, and bank access energy. In this work, we redesign the DRAM main memory system using a proposed monolithically integrated silicon photonics technology and show that our photonically interconnected DRAM (PIDRAM) provides a promising solution to all of these issues. Photonics can provide high aggregate pin-bandwidth density through dense wavelength-division multiplexing. Photonic signaling provides energy-efficient communication, which we exploit to not only reduce chip-to-chip interconnect power but to also reduce cross-chip interconnect power by extending the photonic links deep into the actual PIDRAM chips. To complement these large improvements in interconnect bandwidth and power, we decrease the number of bits activated per bank to improve the energy efficiency of the PIDRAM banks themselves. Our most promising design point yields approximately a 10x power reduction for a single-chip PIDRAM channel with similar throughput and area as a projected future electrical-only DRAM. Finally, we propose optical power guiding as a new technique that allows a single PIDRAM chip design to be used efficiently in several multi-chip configurations that provide either increased aggregate capacity or bandwidth.

[1]  R. Baets,et al.  Compact efficient broadband grating coupler for silicon-on-insulator waveguides. , 2004, Optics letters.

[2]  Ashok V. Krishnamoorthy,et al.  Challenges in building a flat-bandwidth memory hierarchy for a large-scale computer with proximity communication , 2005, 13th Symposium on High Performance Interconnects (HOTI'05).

[3]  Cary Gunn,et al.  CMOS Photonics for High-Speed Interconnects , 2006, IEEE Micro.

[4]  H. Kim,et al.  Performance Boosting of Peripheral Transistor for High Density 4Gb DRAM Technologies by SiGe Selective Epitaxial Growth Technique , 2006, 2006 International SiGe Technology and Device Meeting.

[5]  Alyssa B. Apsel,et al.  Leveraging Optical Technology in Future Bus-based Chip Multiprocessors , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[6]  Frederick A. Ware,et al.  Improving Power and Data Efficiency with Threaded Memory Modules , 2006, 2006 International Conference on Computer Design.

[7]  Vladimir Stojanovic,et al.  Silicon photonics for compact, energy-efficient interconnects [Invited] , 2007, Journal of Optical Networking.

[8]  Feng Lin,et al.  DRAM Circuit Design: Fundamental and High-Speed Topics , 2007 .

[9]  Rajeev J Ram,et al.  Localized substrate removal technique enabling strong-confinement microphotonics in bulk Si CMOS processes , 2008, 2008 Conference on Lasers and Electro-Optics and 2008 Conference on Quantum Electronics and Laser Science.

[10]  Hanqing Li,et al.  Demonstration of an electronic photonic integrated circuit in a commercial scaled bulk CMOS process , 2008, 2008 Conference on Lasers and Electro-Optics and 2008 Conference on Quantum Electronics and Laser Science.

[11]  S. J. Ben Yoo,et al.  OCDIMM: Scaling the DRAM Memory Wall Using WDM Based Optical Interconnects , 2008, 2008 16th IEEE Symposium on High Performance Interconnects.

[12]  Jung Ho Ahn,et al.  Corona: System Implications of Emerging Nanophotonic Technology , 2008, 2008 International Symposium on Computer Architecture.

[13]  Gabriel H. Loh,et al.  3D-Stacked Memory Architectures for Multi-core Processors , 2008, 2008 International Symposium on Computer Architecture.

[14]  Ting Wu,et al.  A 16Gb/s/link, 64GB/s bidirectional asymmetric memory interface cell , 2008, 2008 IEEE Symposium on VLSI Circuits.

[15]  Jung Ho Ahn,et al.  A Comprehensive Memory Modeling Tool and Its Application to the Design and Analysis of Future Memory Hierarchies , 2008, 2008 International Symposium on Computer Architecture.

[16]  Jung Ho Ahn,et al.  Multicore DIMM: an Energy Efficient Memory Module with Independently Controlled DRAMs , 2009, IEEE Computer Architecture Letters.

[17]  So-Ra Kim,et al.  8Gb 3D DDR3 DRAM using through-silicon-via technology , 2009, 2009 IEEE International Solid-State Circuits Conference - Digest of Technical Papers.

[18]  Ting Wu,et al.  A 16 Gb/s/Link, 64 GB/s Bidirectional Asymmetric Memory Interface , 2009, IEEE Journal of Solid-State Circuits.

[19]  Christopher Batten,et al.  Building Many-Core Processor-to-DRAM Networks with Monolithic CMOS Silicon Photonics , 2009, IEEE Micro.

[20]  Christopher Batten,et al.  Silicon-photonic clos networks for global on-chip communication , 2009, 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip.

[21]  Yu Zhang,et al.  Firefly: illuminating future network-on-chip with nanophotonics , 2009, ISCA '09.

[22]  Nanning Zheng,et al.  3D DRAM Design and Application to 3D Multicore Systems , 2009, IEEE Design & Test of Computers.

[23]  Goichi Ono,et al.  A 12.3-mW 12.5-Gb/s Complete Transceiver in 65-nm CMOS Process , 2010, IEEE Journal of Solid-State Circuits.

[24]  Bryan Casper,et al.  A 47$\,\times\,$ 10 Gb/s 1.4 mW/Gb/s Parallel Interface in 45 nm CMOS , 2010, IEEE Journal of Solid-State Circuits.

[25]  Goichi Ono,et al.  A 12.3mW 12.5Gb/s complete transceiver in 65nm CMOS , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).

[26]  Jae-Hyung Lee,et al.  A 7Gb/s/pin GDDR5 SDRAM with 2.5ns bank-to-bank active time and no bank-group restriction , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).

[27]  Norman P. Jouppi,et al.  Rethinking DRAM design and organization for energy-constrained multi-cores , 2010, ISCA.

[28]  Young-Hyun Jun,et al.  8 Gb 3-D DDR3 DRAM Using Through-Silicon-Via Technology , 2009, IEEE Journal of Solid-State Circuits.

[29]  Jae-Hyung Lee,et al.  A 7 Gb/s/pin 1 Gbit GDDR5 SDRAM With 2.5 ns Bank to Bank Active Time and No Bank Group Restriction , 2011, IEEE Journal of Solid-State Circuits.